Commit f70588a

update docs (#11)
1 parent ef83cde commit f70588a

4 files changed: 36 additions & 33 deletions

README.md

Lines changed: 13 additions & 14 deletions
@@ -1,5 +1,5 @@

-## DocsScraper: "A document scraping and parsing tool used to create a custom RAG database for AIHelpMe.jl"
+## DocsScraper: "Efficient RAG knowledge pack creator from online Julia documentation"
 [![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://juliagenai.github.io/DocsScraper.jl/dev/) [![Build Status](https://github.com/JuliaGenAI/DocsScraper.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/JuliaGenAI/DocsScraper.jl/actions/workflows/CI.yml?query=branch%3Amain) [![Coverage](https://codecov.io/gh/JuliaGenAI/DocsScraper.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/JuliaGenAI/DocsScraper.jl) [![Aqua](https://raw.githubusercontent.com/JuliaTesting/Aqua.jl/master/badge.svg)](https://github.com/JuliaTesting/Aqua.jl)


@@ -15,27 +15,27 @@ It scrapes and parses the URLs and with the help of PromptingTools.jl, creates a

 ## Installation

-To install DocsScraper, use the Julia package manager and the package name:
+To install DocsScraper, use the Julia package manager and the package name (it's not registered yet):

 ```julia
 using Pkg
-Pkg.add("DocsScraper")
+Pkg.add(url="https://github.com/JuliaGenAI/DocsScraper.jl")
 ```


 **Prerequisites:**

 - Julia (version 1.10 or later).
 - Internet connection for API access.
-- OpenAI API keys with available credits. See [How to Obtain API Keys](#how-to-obtain-api-keys).
+- OpenAI API keys with available credits. See [How to Obtain API Keys](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions#Creating-OpenAI-API-Key).


 ## Building the Index
 ```julia
 crawlable_urls = ["https://juliagenai.github.io/DocsScraper.jl/dev/home/"]

 index_path = make_knowledge_packs(crawlable_urls;
-index_name = "docsscraper", embedding_dimension = 1024, embedding_bool = true, target_path=joinpath(pwd(), "knowledge_packs"))
+index_name = "docsscraper", embedding_dimension = 1024, embedding_bool = true, target_path="knowledge_packs")
 ```
 ```julia
 [ Info: robots.txt unavailable for https://juliagenai.github.io:/DocsScraper.jl/dev/home/
@@ -73,14 +73,12 @@ a docsscraper__v20240823__textembedding3large-1024-Bool__v1.0.hdf5

 ```julia
 using AIHelpMe
+using AIHelpMe: pprint, load_index!

-# Either use the index explicitly
-aihelp(index_path, "what is DocsScraper.jl?")
+# set it as the "default" index, then it will be automatically used for every question
+load_index!(index_path)

-# or set it as the "default" index, then it will be automatically used for every question
-AIHelpMe.load_index!(index_path)
-
-pprint(aihelp("what is DocsScraper.jl?"))
+aihelp("what is DocsScraper.jl?") |> pprint
 ```
 ```julia
 [ Info: Updated RAG pipeline to `:bronze` (Configuration key: "textembedding3large-1024-Bool").
@@ -96,8 +94,9 @@ PromptingTools.jl, creates a vector store that can be utilized in RAG (Retrieval
 AIHelpMe.jl and PromptingTools.jl to provide efficient and relevant query retrieval, ensuring that the responses generated by the system are specific to the content in the created database.
 ```

-Tip: Use `pprint` for nicer outputs with sources
+Tip: Use `pprint` for nicer outputs with sources and `last_result` for more detailed outputs (with sources).
 ```julia
-using AIHelpMe: pprint, last_result
-print(last_result)
+using AIHelpMe: last_result
+# last_result() returns the last result from the RAG pipeline, ie, same as running aihelp(; return_all=true)
+print(last_result())
 ```
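
Note on the `print(last_result)` to `print(last_result())` change in the hunk above: without the parentheses, Julia passes the function object itself to `print`, so the stored result is never shown. The sketch below illustrates the difference with a hypothetical stand-in (`last_result_demo` and `LAST` are invented for illustration and are not part of AIHelpMe).

```julia
# Hypothetical stand-in for AIHelpMe.last_result; NOT the real implementation.
const LAST = Ref{Any}("cached RAG result")
last_result_demo() = LAST[]

# Passing the function object: this prints a description of the function,
# not the value it would return.
println(last_result_demo)

# Calling the function first: this prints the stored value.
println(last_result_demo())
```

The same reasoning applies to any zero-argument accessor: the call parentheses are what retrieve the value.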

docs/src/index.md

Lines changed: 12 additions & 13 deletions
@@ -1,5 +1,6 @@

-## DocsScraper: "A document scraping and parsing tool used to create a custom RAG database for AIHelpMe.jl"
+# DocsScraper
+
 DocsScraper is a package designed to create "knowledge packs" from online documentation sites for the Julia language.

 It scrapes and parses the URLs and with the help of PromptingTools.jl, creates an index of chunks and their embeddings that can be used in RAG applications. It integrates with AIHelpMe.jl and PromptingTools.jl to offer highly efficient and relevant query retrieval, ensuring that the responses generated by the system are specific to the content in the created database.
@@ -12,19 +13,19 @@ It scrapes and parses the URLs and with the help of PromptingTools.jl, creates a

 ## Installation

-To install DocsScraper, use the Julia package manager and the package name:
+To install DocsScraper, use the Julia package manager and the package name (it's not registered yet):

 ```julia
 using Pkg
-Pkg.add("DocsScraper")
+Pkg.add(url="https://github.com/JuliaGenAI/DocsScraper.jl")
 ```


 **Prerequisites:**

 - Julia (version 1.10 or later).
 - Internet connection for API access.
-- OpenAI API keys with available credits. See [How to Obtain API Keys](#how-to-obtain-api-keys).
+- OpenAI API keys with available credits. See [How to Obtain API Keys](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions#Creating-OpenAI-API-Key).


 ## Building the Index
@@ -70,14 +71,12 @@ a docsscraper__v20240823__textembedding3large-1024-Bool__v1.0.hdf5

 ```julia
 using AIHelpMe
+using AIHelpMe: pprint, load_index!

-# Either use the index explicitly
-aihelp(index_path, "what is DocsScraper.jl?")
-
-# or set it as the "default" index, then it will be automatically used for every question
-AIHelpMe.load_index!(index_path)
+# set it as the "default" index, then it will be automatically used for every question
+load_index!(index_path)

-pprint(aihelp("what is DocsScraper.jl?"))
+aihelp("what is DocsScraper.jl?") |> pprint
 ```
 ```julia
 [ Info: Updated RAG pipeline to `:bronze` (Configuration key: "textembedding3large-1024-Bool").
@@ -93,8 +92,8 @@ PromptingTools.jl, creates a vector store that can be utilized in RAG (Retrieval
 AIHelpMe.jl and PromptingTools.jl to provide efficient and relevant query retrieval, ensuring that the responses generated by the system are specific to the content in the created database.
 ```

-Tip: Use `pprint` for nicer outputs with sources
+Tip: Use `pprint` for nicer outputs with sources and `last_result` for more detailed outputs (with sources).
 ```julia
-using AIHelpMe: pprint, last_result
-print(last_result)
+using AIHelpMe: last_result
+print(last_result())
 ```
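
Note on the switch from `pprint(aihelp(...))` to `aihelp(...) |> pprint`: Julia's `|>` operator applies the function on its right to the value on its left, so both spellings should behave the same. A minimal sketch with hypothetical stand-ins (`answer` and `show_answer` are invented here and only mimic the shape of `aihelp` and `pprint`):

```julia
# Hypothetical stand-ins; the real aihelp/pprint come from AIHelpMe.
answer(question) = "answer to: " * question
show_answer(x) = string(">> ", x)

nested = show_answer(answer("what is DocsScraper.jl?"))
piped = answer("what is DocsScraper.jl?") |> show_answer

# Both forms call show_answer on the result of answer.
println(nested == piped)
```

The piped form reads left to right, which is presumably why the docs adopted it; the choice is stylistic.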

docs/src/working.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

examples/scripts/using_with_AIHelpMe.jl

Lines changed: 11 additions & 5 deletions
@@ -4,16 +4,22 @@ Pkg.add(url = "https://github.com/JuliaGenAI/DocsScraper.jl/")
 Pkg.add("AIHelpMe")
 using DocsScraper
 using AIHelpMe
-using AIHelpMe: pprint
+using AIHelpMe: pprint, last_result

 # Creating the index
 crawlable_urls = ["https://juliagenai.github.io/DocsScraper.jl/dev/home/"]
 index_path = make_knowledge_packs(crawlable_urls;
 index_name = "docsscraper", embedding_dimension = 1024, embedding_bool = true,
-target_path = joinpath(pwd(), "knowledge_packs"))
+target_path = "knowledge_packs")

-# Using the index with AIHelpMe
+# Using the index with AIHelpMe, load it as the default index
 AIHelpMe.load_index!(index_path)

-pprint(aihelp("what is DocsScraper.jl?"))
-pprint(aihelp("how do I install DocsScraper?"))
+# Ask questions // pprint is optional
+aihelp("what is DocsScraper.jl?") |> pprint
+
+aihelp("how do I install DocsScraper?") |> pprint
+
+# Get more detailed outputs with sources for the last answer
+# Identical to running aihelp(; return_all=true)
+last_result() |> pprint
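
Note on replacing `joinpath(pwd(), "knowledge_packs")` with `"knowledge_packs"`: in Julia's filesystem functions a relative path is resolved against the current working directory, so the two spellings normally point at the same location. The sketch below checks this with Base functions only. Whether `make_knowledge_packs` resolves `target_path` this way internally is an assumption, not something this diff confirms.

```julia
# Relative form used after this commit.
relative_target = "knowledge_packs"

# Explicit form used before this commit.
explicit_target = joinpath(pwd(), "knowledge_packs")

# abspath resolves a relative path against pwd(), so both name the same directory.
same_location = abspath(relative_target) == normpath(explicit_target)
println(same_location)
```

One practical difference: the relative form follows the working directory at call time, so calling `cd` before building the index changes where the pack is written.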
