Commit 0cacfe9

bmunkholm authored and amotl committed
Data modelling: Fix page about "full-text data"
1 parent ad34ee3 commit 0cacfe9

1 file changed: +8 -8 lines changed

docs/start/modelling/fulltext.md

Lines changed: 8 additions & 8 deletions
@@ -2,7 +2,7 @@
 
 CrateDB features **native full‑text search** powered by **Apache Lucene** and Okapi BM25 ranking, fully accessible via SQL. You can blend this seamlessly with other data types—JSON, time‑series, geospatial, vectors and more—all in a single SQL query platform.
 
-## 1. Data Types & Indexing Strategy
+## Data Types & Indexing Strategy
 
 * By default, all text columns are indexed as `plain` (raw, unanalyzed)—efficient for equality search but not suitable for full‑text queries
 * To enable full‑text search, you must define a **FULLTEXT index** with an optional language **analyzer**, e.g.:
@@ -21,7 +21,7 @@ CREATE TABLE documents (
 INDEX ft_all USING FULLTEXT(title, body) WITH (analyzer = 'english');
 ```
 
-## 2. Index Design & Custom Analyzers
+## Index Design & Custom Analyzers
 
 | Component | Purpose |
 | ----------------- | ---------------------------------------------------------------------------- |
@@ -48,7 +48,7 @@ CREATE ANALYZER german_snowball
 WITH (language = 'german');
 ```
 
-## 3. Querying: MATCH Predicate & Scoring
+## Querying: MATCH Predicate & Scoring
 
 CrateDB uses the SQL `MATCH` predicate to run full‑text queries against full‑text indices. It optionally returns a relevance score `_score`, ranked via BM25.
 
@@ -100,7 +100,7 @@ WHERE MATCH((ft_en, ft_de), 'jupm OR verwrlost') USING best_fields WITH (fuzzine
 ORDER BY _score DESC;
 ```
 
-## 4. Use Cases & Integration
+## Use Cases & Integration
 
 CrateDB is ideal for searching **semi-structured large text data**—product catalogs, article archives, user-generated content, descriptions and logs.
 
@@ -119,13 +119,13 @@ WHERE
 
 This blend lets you query by text relevance, numeric filters, and spatial constraints, all in one.
 
-## 5. Architectural Strengths
+## Architectural Strengths
 
 * **Built on Lucene inverted index + BM25**, offering relevance ranking comparable to search engines.
 * **Scale horizontally across clusters**, while maintaining fast indexing and search even on high volume datasets.
 * **Integrated SQL interface**: eliminates need for separate search services like Elasticsearch or Solr.
 
-## 6. Best Practices Checklist
+## Best Practices Checklist
 
 | Topic | Recommendation |
 | ------------------- | ---------------------------------------------------------------------------------- |
@@ -138,13 +138,13 @@ This blend lets you query by text relevance, numeric filters, and spatial constr
 | Multi-model Queries | Combine full-text search with geo, JSON, numerical filters. |
 | Analyze Limitations | Understand phrase\_prefix caveats at scale; tune analyzer/tokenizer appropriately. |
 
-## 7. Further Learning & Resources
+## Further Learning & Resources
 
 * **CrateDB Full‑Text Search Guide**: details index creation, analyzers, MATCH usage.
 * **FTS Options & Advanced Features**: fuzziness, synonyms, multi-language idioms.
 * **Hands‑On Academy Course**: explore FTS on real datasets (e.g. Chicago neighborhoods).
 * **CrateDB Community Insights**: real‑world advice and experiences from users.
 
-## **8. Summary**
+## **Summary**
 
 CrateDB combines powerful Lucene‑based full‑text search capabilities with SQL, making it easy to model and query textual data at scale. It supports fuzzy matching, multi-language analysis, composite indexing, and integrates fully with other data types for rich, multi-model queries. Whether you're building document search, catalog lookup, or content analytics—CrateDB offers a flexible and scalable foundation.\
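
The hunks above only show fragments of the page's SQL, so the following is a minimal sketch of the workflow the page describes: define a named `FULLTEXT` index with an analyzer, then query it with `MATCH` and order by `_score`. It is not taken from the changed file. The table `articles_demo`, its columns, and the sample rows are hypothetical and exist only for illustration; the index definition and the `MATCH ... USING best_fields WITH (fuzziness = 1)` form mirror the fragments visible in the diff.

```sql
-- Hypothetical table: name, columns, and rows are illustrative assumptions.
CREATE TABLE articles_demo (
    id INTEGER PRIMARY KEY,
    title TEXT,
    body TEXT,
    -- Named full-text index over both columns, using the built-in English analyzer.
    INDEX ft_all USING FULLTEXT (title, body) WITH (analyzer = 'english')
);

INSERT INTO articles_demo (id, title, body) VALUES
    (1, 'Jumping foxes', 'The quick brown fox jumps over the lazy dog.'),
    (2, 'Sleeping dogs', 'A lazy dog sleeps all day.');

-- Make the newly inserted rows visible to the SELECT that follows.
REFRESH TABLE articles_demo;

-- Full-text query against the named index, ranked by relevance.
-- The misspelled term and the fuzziness option follow the style of the page's own example.
SELECT id, title, _score
FROM articles_demo
WHERE MATCH(ft_all, 'jupm') USING best_fields WITH (fuzziness = 1)
ORDER BY _score DESC;
```

The index is referenced by its name (`ft_all`) in `MATCH`, not by the underlying columns; running the sketch requires a CrateDB cluster, and scores will depend on the data loaded.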
