llama-index-integrations/readers/llama-index-readers-web/llama_index/readers/web/firecrawl_web/README.md
Lines changed: 25 additions & 14 deletions
@@ -6,25 +6,31 @@
 
 1. **Install Firecrawl Package**: Ensure the `firecrawl-py` package is installed to use the Firecrawl Web Loader. Install it via pip with the following command:
 
-```bash
-pip install firecrawl-py
-```
+```bash
+pip install 'firecrawl-py>=4.3.3'
+```
 
 2. **API Key**: Secure an API key from [Firecrawl.dev](https://www.firecrawl.dev/) to access the Firecrawl services.
 
 ### Using Firecrawl Web Loader
 
-- **Initialization**: Initialize the FireCrawlWebReader by providing the API key, the desired mode of operation (`crawl`, `scrape`, `search`, or `extract`), and any optional parameters for the Firecrawl API.
+- **Initialization**: Initialize the `FireCrawlWebReader` by providing the API key, the desired mode of operation (`crawl`, `scrape`, `map`, `search`, or `extract`), and any optional parameters for the Firecrawl API.
 
-```python
-from llama_index.readers.web.firecrawl_web.base import FireCrawlWebReader
+```python
+from llama_index.readers.web.firecrawl_web.base import FireCrawlWebReader
 
-firecrawl_reader = FireCrawlWebReader(
-    api_key="your_api_key_here",
-    mode="crawl",  # or "scrape" or "search" or "extract"
-    params={"additional": "parameters"},
-)
-```
+firecrawl_reader = FireCrawlWebReader(
+    api_key="your_api_key_here",
+    mode="crawl",  # or "scrape" or "map" or "search" or "extract"
+    # Common params for the underlying Firecrawl client
+    # e.g. formats for content types and crawl limits
+    params={
+        "formats": ["markdown", "html"],  # for scrape or crawl
+        "limit": 100,  # for crawl
+        # "scrape_options": {"formats": ["markdown", "html"]},  # alternative shape for crawl
+    },
+)
+```
 
 - **Loading Data**: To load data, use the `load_data` method with the URL you wish to process.
 
@@ -43,8 +49,13 @@ Here is an example demonstrating how to initialize the FireCrawlWebReader, load
 # Initialize the FireCrawlWebReader with your API key and desired mode
 firecrawl_reader = FireCrawlWebReader(
     api_key="your_api_key_here",  # Replace with your actual API key
-    mode="crawl",  # Choose between "crawl", "scrape", "search" and "extract"
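
The mode list documented in this change (`crawl`, `scrape`, `map`, `search`, `extract`) can be checked before the reader is constructed. The following is a minimal sketch and not part of the PR: `VALID_MODES` and `build_reader_kwargs` are hypothetical names, and the sketch only assembles the `api_key`, `mode`, and `params` keyword arguments shown in the README; it does not call Firecrawl.

```python
from typing import Any, Dict, Optional

# Modes listed in the updated README. This tuple is an illustrative
# assumption, not a constant exported by the library.
VALID_MODES = ("crawl", "scrape", "map", "search", "extract")


def build_reader_kwargs(
    api_key: str, mode: str, params: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Validate `mode` and assemble reader keyword arguments (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"Unsupported mode {mode!r}; expected one of {VALID_MODES}")
    # Copy params so the caller's dict is never mutated.
    return {"api_key": api_key, "mode": mode, "params": dict(params or {})}


kwargs = build_reader_kwargs(
    api_key="your_api_key_here",
    mode="crawl",
    params={"formats": ["markdown", "html"], "limit": 100},
)
# With the package installed, these could be passed on as:
#   from llama_index.readers.web.firecrawl_web.base import FireCrawlWebReader
#   firecrawl_reader = FireCrawlWebReader(**kwargs)
```

Validating the mode up front turns a typo such as `"crawls"` into an immediate `ValueError`, before any request is made.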