Commit dcaf515

ckunki and tkilias authored
Feature/440 re enabled parquet import in notebook first steps.ipynb (#445)
* Updated graphic overview
* #441: Added section "Getting Started" to User Guide
* Removed notes about pyexasol version < 1 and DB version required for Parquet import which only applies when using the SQL IMPORT statement.

Co-authored-by: Torsten Kilias <[email protected]>
1 parent 992746f commit dcaf515

6 files changed: +35 -39 lines changed

README.md

Lines changed: 8 additions & 8 deletions
@@ -2,20 +2,20 @@

 The Exasol AI Lab is a pre-configured container designed to empower data scientists. It streamlines common data science and AI tasks, including data loading, preparation, exploration, model training, and deployment. Whether you’re a seasoned practitioner or just getting started, the AI Lab provides a hassle-free experience.

-![Transformers Extension](https://github.com/exasol/ai-lab/blob/4.0.0/doc/user_guide/ai-lab-screenshot.png)
+![Screenshot Transformers Extension](https://github.com/exasol/ai-lab/blob/4.0.0/doc/user_guide/ai-lab-screenshot.png)

 Key Features:
-* Jupyter Notebook Environment: The heart of the AI Lab is a robust Jupyter Notebook environment. It’s where you’ll work on your data science projects.
-* Exasol Integration: Leverage Exasol’s power for your AI and machine learning use cases. The AI Lab includes essential Exasol packages, extensions, and configuration tasks.
-* Example Notebooks: Jumpstart your work with ready-to-use example notebooks. Explore classic machine learning scenarios (think scikit-learn), seamlessly integrate Exasol with AWS SageMaker, and tap into Hugging Face models directly within Exasol.
+* **Jupyter Notebook Environment**: The heart of the AI Lab is a robust Jupyter Notebook environment. It is where you will work on your AI and Data Science projects.
+* **Exasol Integration**: Leverage Exasol’s power for your AI and machine learning use cases. The AI Lab includes essential Exasol packages, extensions, and configuration tasks.
+* **Example Notebooks**: Jumpstart your work with ready-to-use example notebooks. Explore classic machine learning scenarios (think scikit-learn), seamlessly integrate Exasol with AWS SageMaker, and tap into Hugging Face models directly within Exasol.

-Feel free to explore the Exasol AI Lab and unleash your data science potential!
+## Getting Started

-The AI Lab is available in multiple [Editions](https://github.com/exasol/ai-lab/blob/4.0.0/doc/user_guide/editions.md) involving different technology stacks, see also common [System Requirements](https://github.com/exasol/ai-lab/blob/4.0.0/doc/user_guide/system-requirements.md).
+Feel free to explore the Exasol AI Lab and unleash your AI potential!

-After downloading the required files and having started the AI Lab you can connect to AI Lab's [Jupyter Service](https://github.com/exasol/ai-lab/blob/4.0.0/doc/user_guide/jupyter.md).
+The AI Lab is available in **multiple editions**. Please visit the [User Guide](doc/user_guide/user-guide.md) and pick
+the one that is most convenient to your preferences.

 ## Additional Links

-* [Troubleshooting](doc/user_guide/troubleshooting.md)
 * [Developer Guide](https://github.com/exasol/ai-lab/blob/4.0.0/doc/developer_guide/developer_guide.md)

doc/changes/changes_4.1.0.md

Lines changed: 5 additions & 0 deletions
@@ -4,10 +4,15 @@ Code name:

 ## Summary

+## Features
+
+* #440: Re-enabled Parquet import in notebook `first_steps.ipynb`
+
 ## Documentation

 * #432: Fixed structure and links in notebook first-steps
 * #337: Ensured correct spelling for AI Lab
+* #441: Added section "Getting Started" to User Guide

 ## Features

doc/user_guide/docker/docker-usage.md

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ Additional options
 * Additional [Limitations and security risks](prerequisites.md#enabling-exasol-ai-lab-to-use-docker-features) apply.
 * Only file system objects on the daemon machine can be mounted. This applies to ordinary directories as well as the `docker.sock`.
 * On Windows mounting `docker.sock` only works with Docker Desktop with WSL 2.
-* You can mount the Docker Socket with `--volume /var/run/docker.sock:/var/run/docker.sock`
+* You can mount the Docker Socket with `--volume /var/run/docker.sock:/var/run/docker.sock`, see [graphical overview](prerequisites.md).

 The following example uses all additional options:

doc/user_guide/docker/docker.png

61.5 KB
Lines changed: 8 additions & 7 deletions
@@ -1,4 +1,4 @@
-# AI Lab Editions
+# AI Lab User Guide

 Exasol AI Lab is available in multiple editions as shown in the following table.

@@ -9,16 +9,17 @@ Recommendations
 * In case a Docker client is available on your system then probably the Docker Edition is the best choice.
 * When you want to use the VM Edition then select an appropriate VM image format depending on the Hypervisor software available on your system.

-
-| Description | Format(s) |
+| Edition | Image(s) |
 |---------------------------------------------------|--------------------------------------------------------------------------|
 | [AMI Edition](ami-usage.md) | Amazon Machine Image (AMI) |
 | [Docker Edition](docker/docker-usage.md) | Docker Image |
 | [Virtual Machine Edition](vm-edition/vm-usage.md) | VMware Virtual Machine Disk (VMDK), Virtual Hard Disk by Microsoft (VHD) |

-Each of the editions is associated with an _image_ in a specific format.
-
-The image contains all necessary dependencies and provides a running instance of Jupyterlab which is automatically started when booting or running the image.
+Each image contains all necessary dependencies and automatically launches Jupyterlab when booting or running the image.

-For download go to the [release notes](https://github.com/exasol/ai-lab/releases/latest).
+Please
+* [Download](https://github.com/exasol/ai-lab/releases/latest) your favorite image from the latest AI Lab release on GitHub,
+* Run the AI Lab as described in the resp. documentation for the edition, and
+* Connect to AI Lab's [Jupyter Service](https://github.com/exasol/ai-lab/blob/4.0.0/doc/user_guide/jupyter.md).

+In case of problems, please refer to our [Troubleshooter](troubleshooting.md).

exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/first_steps.ipynb

Lines changed: 13 additions & 23 deletions
@@ -256,9 +256,7 @@
 "id": "0dc80445-e5c0-43c1-ae34-f16326d6fb94",
 "metadata": {},
 "source": [
-"**Please note**: \n",
-"* Parquet import requires using **<span style=\"color: #40a\">Exasol version 2025.\\* or higher</span>**, see [docs.exasol.com](https://docs.exasol.com/db/latest/loading_data/load_data_parquet.htm).\n",
-"* Hence, the import is currently commented out.\n",
+"**Please note**: Parquet import via SQL statement `IMPORT` requires using **<span style=\"color: #40a\">Exasol version 2025.\\* or higher</span>**, see [docs.exasol.com](https://docs.exasol.com/db/latest/loading_data/load_data_parquet.htm).\n",
 "\n",
 "Now we can import the Parquet file from S3 into the database:"
 ]
@@ -271,10 +269,9 @@
 "outputs": [],
 "source": [
 "%%sql \n",
-"SELECT 1 -- avoid error from empty cell \n",
 "\n",
-"-- IMPORT INTO US_FLIGHTS FROM PARQUET AT AI_LAB_FIRST_STEPS_S3 \n",
-"-- FILE 'first_steps/US_FLIGHTS_FEB_2024.parquet'"
+"IMPORT INTO US_FLIGHTS FROM PARQUET AT AI_LAB_FIRST_STEPS_S3 \n",
+" FILE 'first_steps/US_FLIGHTS_FEB_2024.parquet'"
 ]
 },
 {
@@ -305,11 +302,6 @@
 "\n",
 "[PyExasol](https://exasol.github.io/pyexasol/master/index.html) is the basic Python connector for interacting with Exasol databases.\n",
 "\n",
-"Please note\n",
-"* Using PyExasol's import or export functions with Exasol database versions `2025.*` and higher requires Pyexasol version ≥ `1.2`.\n",
-"* AI Lab currently is shipped with pyexasol version `0.27.0`.\n",
-"* Hence, the Pyexasol examples can only be executed with Exasol database versions < `2025`.\n",
-"\n",
 "### 3.1 Importing a CSV File from the Local Filesystem\n",
 "\n",
 "This section demonstrates how to import a CSV file from the local file system into the database using Pyexasol.\n",
@@ -456,8 +448,8 @@
 "metadata": {},
 "source": [
 "**Please note**: \n",
-"* Parquet import requires using **<span style=\"color: #40a\">Exasol version 2025.\\* or higher</span>**, see [docs.exasol.com](https://docs.exasol.com/db/latest/loading_data/load_data_parquet.htm).\n",
-"* Hence, the import is currently commented out.\n",
+"* Parquet import via SQL statement `IMPORT` requires using **<span style=\"color: #40a\">Exasol version 2025.\\* or higher</span>**, see [docs.exasol.com](https://docs.exasol.com/db/latest/loading_data/load_data_parquet.htm).\n",
+"* This restriction does not apply when using PyExasol's native function [import_from_parquet()](https://exasol.github.io/pyexasol/master/user_guide/exploring_features/import_and_export/index.html#id8).\n",
 "\n",
 "Now we can import the Parquet file from S3 into the database:"
 ]
@@ -471,17 +463,17 @@
 "source": [
 "from exasol.nb_connector.connections import open_pyexasol_connection\n",
 "\n",
-"query = \"IMPORT INTO {table!q} FROM PARQUET AT {connection} FILE {file!s}\"\n",
+"query = \"IMPORT INTO {table!q} FROM PARQUET AT {connection!r} FILE {file!s}\"\n",
 "\n",
 "params = {\n",
 " \"table\": (ai_lab_config.db_schema, \"US_FLIGHTS\"),\n",
 " \"connection\": \"AI_LAB_FIRST_STEPS_S3\",\n",
 " \"file\": \"first_steps/US_FLIGHTS_FEB_2024.parquet\",\n",
 "}\n",
 "\n",
-"#with open_pyexasol_connection(ai_lab_config, compression=True) as conn:\n",
-"# result = conn.execute(query, params)\n",
-"#print(f\"Imported {result.rowcount()} rows.\")"
+"with open_pyexasol_connection(ai_lab_config, compression=True) as conn:\n",
+" result = conn.execute(query, params)\n",
+"print(f\"Imported {result.rowcount()} rows.\")"
 ]
 },
 {
@@ -578,9 +570,7 @@
 "id": "d11dcb6d-d3e5-4e1e-8d61-ba59897c56e9",
 "metadata": {},
 "source": [
-"**Please note**: \n",
-"* Parquet import requires using **<span style=\"color: #40a\">Exasol version 2025.\\* or higher</span>**, see [docs.exasol.com](https://docs.exasol.com/db/latest/loading_data/load_data_parquet.htm).\n",
-"* Hence, the import is currently commented out.\n",
+"**Please note**: Parquet import via SQL statement `IMPORT` requires using **<span style=\"color: #40a\">Exasol version 2025.\\* or higher</span>**, see [docs.exasol.com](https://docs.exasol.com/db/latest/loading_data/load_data_parquet.htm).\n",
 "\n",
 "Now we can import the Parquet file from S3 into the database:"
 ]
@@ -604,9 +594,9 @@
 " )\n",
 ")\n",
 "\n",
-"# with engine.connect() as conn:\n",
-"# result = conn.execute(t)\n",
-"# print(f\"Imported {result.rowcount} rows.\")"
+"with engine.connect() as conn:\n",
+" result = conn.execute(t)\n",
+"print(f\"Imported {result.rowcount} rows.\")"
 ]
 },
 {
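
Note on the `{connection}` to `{connection!r}` change in the PyExasol cell: the sketch below illustrates why the conversion flag matters. It does not use PyExasol itself. `SketchFormatter` is a hypothetical stand-in built on Python's `string.Formatter`, written under the assumption that PyExasol quotes a placeholder without a conversion as a string literal (like `!s`), quotes `!q` as an identifier, and inserts `!r` as raw text. The schema name `AI_LAB` is also an assumption for illustration; the notebook takes it from `ai_lab_config.db_schema`.

```python
import string


class SketchFormatter(string.Formatter):
    """Hypothetical stand-in for PyExasol-style placeholder conversions."""

    def convert_field(self, value, conversion):
        # Assumption: a placeholder without a conversion behaves like "!s".
        if conversion is None or conversion == "s":
            return self.quote(value)
        if conversion == "q":
            return self.quote_ident(value)
        if conversion == "r":
            return str(value)  # raw: inserted without any quoting
        raise ValueError(f"Unsupported conversion: {conversion!r}")

    @staticmethod
    def quote(value):
        # SQL string literal; embedded single quotes are doubled.
        return "'" + str(value).replace("'", "''") + "'"

    @classmethod
    def quote_ident(cls, value):
        # A tuple such as (schema, table) becomes "schema"."table".
        if isinstance(value, tuple):
            return ".".join(cls.quote_ident(part) for part in value)
        return '"' + str(value).replace('"', '""') + '"'


def render(query, params):
    """Fill the placeholders of `query` with values from `params`."""
    return SketchFormatter().format(query, **params)


params = {
    "table": ("AI_LAB", "US_FLIGHTS"),  # schema name is an assumption
    "connection": "AI_LAB_FIRST_STEPS_S3",
    "file": "first_steps/US_FLIGHTS_FEB_2024.parquet",
}

old_query = render(
    "IMPORT INTO {table!q} FROM PARQUET AT {connection} FILE {file!s}", params
)
new_query = render(
    "IMPORT INTO {table!q} FROM PARQUET AT {connection!r} FILE {file!s}", params
)

print(old_query)  # connection name rendered as a quoted string literal
print(new_query)  # connection name rendered as a bare identifier
```

Under these assumptions the old query would pass the connection name as a string literal, while the new query passes it unquoted, which matches the form used in the `%%sql` cell of the same notebook.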
