diff --git a/Analyzing_Data/Raster_Text_To_Segments.ipynb b/Analyzing_Data/Raster_Text_To_Segments.ipynb
new file mode 100644
index 0000000..faf502e
--- /dev/null
+++ b/Analyzing_Data/Raster_Text_To_Segments.ipynb
@@ -0,0 +1,489 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "\n",
+ "# WherobotsAI Raster Inference (Text-Prompted Models)\n",
+ "\n",
+ "In this notebook we introduce text-prompted Raster Inference in WherobotsAI, powered by Meta AI’s SAM2 and Google DeepMind’s OWLv2 models. Raster Inference brings the ability to run advanced vision models directly on large-scale satellite imagery, turning pixels into analysis-ready vector features at scale. \n",
+ "\n",
+ "We will demonstrate how to apply these models to NAIP imagery of Miami Airport, using `RS_Text_to_BBoxes` (OWLv2) for object detection and `RS_Text_to_Segments` (SAM2) for segmentation. \n",
+ "\n",
+ "[Read more about Wherobots Raster Inference in the documentation](https://docs.wherobots.com/latest/tutorials/wherobotsai/wherobots-inference/raster-inference-overview/?h=raster+inference). "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Define Sedona context"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sedona.spark import *\n",
+ "from pyspark.sql.functions import expr\n",
+ "\n",
+ "config = SedonaContext.builder().getOrCreate()\n",
+ "sedona = SedonaContext.create(config)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Load NAIP imagery for Miami Airport\n",
+ "\n",
+ "We will use the native raster reader to load GeoTIFFs as out-of-database or \"out-db\" rasters and perform dyanamic tiling on read.\n",
+ "Spliting the large GeoTIFF into small tiles improves the distribution of workload across the cluster."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "url = \"s3://wherobots-examples/data/naip/miami-airport.tiff\"\n",
+ "tile_size = 256\n",
+ "\n",
+ "df = sedona.read.format(\"raster\")\\\n",
+ " .option(\"tileWidth\", tile_size)\\\n",
+ " .option(\"tileHeight\", tile_size)\\\n",
+ " .load(url)\n",
+ "\n",
+ "df.createOrReplaceTempView(\"df\")\n",
+ "df.show(5)"
+ ]
+ },
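+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick, optional sanity check, we can count how many tiles dynamic tiling produced; more, smaller tiles generally mean finer-grained parallelism across the cluster."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Count the tiles produced by dynamic tiling (forces a scan, so it may take a moment).\n",
+ "print(f\"Number of tiles: {df.count()}\")"
+ ]
+ },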
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Viewing Raster Inputs\n",
+ "\n",
+ "Before running inference, it’s useful to explore the imagery itself. \n",
+ "We’ll visualize the footprints of the raster tiles with SedonaKepler and preview a few raw images using SedonaUtils. This gives us confidence that the data is being read correctly and aligned spatially before applying any models.\n",
+ "\n",
+ "> Tip: You can also save the Kepler map as an interactive HTML file with `kepler_map.save_to_html()`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "kepler_map = SedonaKepler.create_map()\n",
+ "df = df.withColumn('footprint', expr(\"ST_TRANSFORM(RS_CONVEXHULL(rast),'EPSG:4326')\"))\n",
+ "SedonaKepler.add_df(kepler_map, df=df, name=\"Image Footprints\")\n",
+ "\n",
+ "kepler_map"
+ ]
+ },
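+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Following the tip above, here is a minimal sketch of exporting the map; the file name is illustrative."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Save the interactive map as a standalone HTML file.\n",
+ "# The file name below is illustrative; pick any path you like.\n",
+ "kepler_map.save_to_html(file_name=\"image_footprints.html\")"
+ ]
+ },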
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "htmlDf = sedona.sql(f\"\"\"SELECT RS_AsImage(rast, 250) as FROM df limit 5\"\"\")\n",
+ "SedonaUtils.display_image(htmlDf)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Running segementation using SAM2 model\n",
+ "\n",
+ "Now we’ll run inference over the raster tiles using Wherobots SQL function `RS_Text_to_Segments()`.\n",
+ "\n",
+ "For this example, we specify the following parameters -\n",
+ "- Model: `\"sam2\"`\n",
+ "- Text prompt: `\"airplanes\"`\n",
+ "- Confidence threshold: `0.5`\n",
+ "\n",
+ "The confidence threshold controls which detections are returned (scores range from 0 to 1). \n",
+ "For this particular model, most positives have confidence scores of at most ~0.7, so we start with `0.5` to favor **recall** on this model/dataset. This helps us understand how the model performs before tightening the threshold later.\n",
+ "\n",
+ "The function returns predicted segments for each raster in our region of interest.\n",
+ "We’ll cache the results for efficiency and register them as a temporary view so we can explore them further."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "model_id = \"sam2\"\n",
+ "prompt = \"airplanes\"\n",
+ "threshold = 0.5\n",
+ "\n",
+ "preds = sedona.sql(f\"\"\"\n",
+ " SELECT \n",
+ " rast, \n",
+ " RS_TEXT_TO_SEGMENTS(\n",
+ " '{model_id}', \n",
+ " rast, \n",
+ " '{prompt}', \n",
+ " {threshold}\n",
+ " ) AS preds \n",
+ " FROM df\n",
+ "\"\"\")\n",
+ "\n",
+ "preds.cache().count()\n",
+ "preds.createOrReplaceTempView(\"preds\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Preparing Results\n",
+ "\n",
+ "The raw inference output contains arrays of predictions (segments, confidence scores, and labels) for each raster tile. \n",
+ "Before plotting, we need to flatten these arrays so that each row corresponds to a single prediction.\n",
+ "\n",
+ "We’ll: \n",
+ "1. Filter out empty or invalid results. \n",
+ "2. Use `arrays_zip` + `explode` to turn lists of predictions into individual rows. \n",
+ "3. Keep the geometry, confidence score, and label for each detected object. \n",
+ "\n",
+ "This prepares the data for mapping and analysis with SedonaKepler."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "preds_filtered = sedona.sql(f\"\"\"\n",
+ " SELECT *\n",
+ " FROM preds\n",
+ " WHERE\n",
+ " size(preds.labels) > 0\n",
+ " AND array_contains(preds.labels, 1)\n",
+ " AND NOT array_contains(preds.segments_wkt, 'POLYGON EMPTY')\n",
+ "\"\"\")\n",
+ "\n",
+ "preds_filtered.createOrReplaceTempView(\"preds_filtered\")\n",
+ "preds_filtered.show(5)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "exploded = sedona.sql(\"\"\"\n",
+ " SELECT\n",
+ " rast,\n",
+ " exploded_predictions.*\n",
+ " FROM\n",
+ " preds_filtered\n",
+ " LATERAL VIEW explode(arrays_zip(preds.segments_wkt, preds.confidence_scores, preds.labels)) AS exploded_predictions\n",
+ " WHERE\n",
+ " exploded_predictions.confidence_scores != 0.0\n",
+ "\"\"\")\n",
+ "exploded.cache().count()\n",
+ "exploded.createOrReplaceTempView(\"exploded\")\n",
+ "exploded.show(5)"
+ ]
+ },
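+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Earlier we chose `0.5` to favor recall; before tightening the threshold, it helps to look at the distribution of confidence scores. The sketch below summarizes them from the `exploded` view with standard Spark SQL aggregates."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Summarize the confidence scores of the flattened predictions.\n",
+ "sedona.sql(\"\"\"\n",
+ "    SELECT\n",
+ "        count(*) AS n_detections,\n",
+ "        min(confidence_scores) AS min_score,\n",
+ "        percentile_approx(confidence_scores, 0.5) AS median_score,\n",
+ "        max(confidence_scores) AS max_score\n",
+ "    FROM exploded\n",
+ "\"\"\").show()"
+ ]
+ },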
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Viewing segmentation results\n",
+ "\n",
+ "Let’s map the predicted segments to sanity-check geometry and coverage.\n",
+ "We’ll add the detections as a layer in SedonaKepler—use the tooltip to inspect confidence per feature."
+ ]
+ },
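+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "> Note: depending on your WherobotsDB version, `segments_wkt` may be returned as WKT text rather than a geometry column. If so, the optional sketch below converts it with `ST_GeomFromWKT` so SedonaKepler can render it; skip the cell if the column is already a geometry."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional: convert WKT text to a geometry column for mapping.\n",
+ "# Skip this cell if segments_wkt is already a geometry.\n",
+ "exploded = exploded.withColumn(\"geometry\", expr(\"ST_GeomFromWKT(segments_wkt)\"))\n",
+ "exploded.createOrReplaceTempView(\"exploded\")"
+ ]
+ },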
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "kepler_map = SedonaKepler.create_map()\n",
+ "SedonaKepler.add_df(kepler_map, df=exploded, name=\"Airplane Detections\")\n",
+ "\n",
+ "kepler_map"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Overlay results on source imagery\n",
+ "\n",
+ "To see detections on top of the underlying rasters, use `show_detections`.\n",
+ "It expects the non-exploded predictions (arrays per tile) and can filter by confidence.\n",
+ "\n",
+ "> Tip: show_detections works with DataFrames that still have the raster column and arrays of results; exploded DataFrames aren’t supported.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "unpacked_preds_df = sedona.sql(\"\"\"\n",
+ " SELECT\n",
+ " rast,\n",
+ " preds.segments_wkt,\n",
+ " preds.confidence_scores,\n",
+ " preds.labels\n",
+ " FROM preds_filtered\n",
+ "\"\"\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from wherobots.inference.plot.detections import show_detections\n",
+ "\n",
+ "show_detections(\n",
+ " unpacked_preds_df,\n",
+ " confidence_threshold=0.7,\n",
+ " plot_geoms=True,\n",
+ " geometry_column=\"segments_wkt\",\n",
+ ")"
+ ]
+ },
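+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To browse the full set of plotting options, pull up the function’s documentation in the notebook:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Show the docstring and signature for show_detections (IPython help syntax).\n",
+ "show_detections?"
+ ]
+ },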
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Running Object Detection using OWLv2 model\n",
+ "\n",
+ "Now we’ll run inference over the raster tiles using Wherobots’ SQL function `RS_Text_to_BBoxes()`.\n",
+ "\n",
+ "For this example, we specify the following parameters –\n",
+ "- Model: `\"owlv2\"`\n",
+ "- Text prompt: `\"airplanes\"`\n",
+ "- Confidence threshold: `0.5`\n",
+ "\n",
+ "The function returns predicted bounding boxes for each raster in our region of interest.\n",
+ "\n",
+ "We’ll cache the results for efficiency and register them as a temporary view so we can explore them further."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "model_id = \"owlv2\"\n",
+ "prompt = \"airplanes\"\n",
+ "threshold = 0.5\n",
+ "\n",
+ "preds = sedona.sql(f\"\"\"\n",
+ " SELECT \n",
+ " rast, \n",
+ " RS_TEXT_TO_BBoxes(\n",
+ " '{model_id}', \n",
+ " rast, \n",
+ " '{prompt}', \n",
+ " {threshold}\n",
+ " ) AS preds \n",
+ " FROM df\n",
+ "\"\"\")\n",
+ "preds.cache().count()\n",
+ "preds.createOrReplaceTempView(\"preds\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Preparing Results\n",
+ "\n",
+ "The raw inference output contains arrays of predictions (bounding boxes, confidence scores, and labels) for each raster tile. \n",
+ "Before plotting, we need to flatten these arrays so that each row corresponds to a single prediction.\n",
+ "\n",
+ "We’ll: \n",
+ "1. Filter out empty or invalid results. \n",
+ "2. Use `arrays_zip` + `explode` to turn lists of predictions into individual rows. \n",
+ "3. Keep the geometry, confidence score, and label for each detected object. \n",
+ "\n",
+ "This prepares the data for mapping and analysis with SedonaKepler."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "preds_filtered = sedona.sql(f\"\"\"\n",
+ " SELECT *\n",
+ " FROM preds\n",
+ " WHERE\n",
+ " size(preds.labels) > 0\n",
+ " AND array_contains(preds.labels, 1)\n",
+ " AND NOT array_contains(preds.bboxes_wkt, 'POLYGON EMPTY')\n",
+ "\"\"\")\n",
+ "preds_filtered.createOrReplaceTempView(\"preds_filtered\")\n",
+ "preds_filtered.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Each raster tile returns multiple predictions as arrays. \n",
+ "We use `explode` to flatten these arrays so every detected object becomes its own row for easier mapping and analysis."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "exploded = sedona.sql(\"\"\"\n",
+ " SELECT\n",
+ " rast,\n",
+ " exploded_predictions.*\n",
+ " FROM\n",
+ " preds_filtered\n",
+ " LATERAL VIEW explode(arrays_zip(preds.bboxes_wkt, preds.confidence_scores, preds.labels)) AS exploded_predictions\n",
+ " WHERE\n",
+ " exploded_predictions.confidence_scores != 0.0\n",
+ "\"\"\")\n",
+ "exploded.cache().count()\n",
+ "exploded.createOrReplaceTempView(\"exploded\")\n",
+ "exploded.show()"
+ ]
+ },
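+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick sketch, counting the flattened rows gives a feel for how many airplane detections the model returned at this threshold:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Count the flattened OWLv2 detections in the exploded view.\n",
+ "sedona.sql(\"SELECT count(*) AS airplane_detections FROM exploded\").show()"
+ ]
+ },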
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Viewing Object Detection results\n",
+ "\n",
+ "Let’s map the predicted Airplane bboxes to sanity-check geometry and coverage.\n",
+ "We’ll add the detections as a layer in SedonaKepler—use the tooltip to inspect confidence per feature."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "kepler_map = SedonaKepler.create_map()\n",
+ "SedonaKepler.add_df(kepler_map, df=exploded, name=\"Airplane Detections\")\n",
+ "\n",
+ "kepler_map"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Overlay results on source imagery\n",
+ "\n",
+ "To see detections on top of the underlying rasters, use `show_detections`.\n",
+ "It expects the non-exploded predictions (arrays per tile) and can filter by confidence.\n",
+ "\n",
+ "> Tip: show_detections works with DataFrames that still have the raster column and arrays of results; exploded DataFrames aren’t supported."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "unpacked_preds_df = sedona.sql(\"\"\"\n",
+ " SELECT\n",
+ " rast,\n",
+ " preds.bboxes_wkt,\n",
+ " preds.confidence_scores,\n",
+ " preds.labels\n",
+ " FROM preds_filtered\n",
+ "\"\"\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We see below that OWLv2 and SAM2 do remarkably well at identifying airplanes with little user effort! Previously, achieving similar results was a significant undertaking. An entire Machine Learning engineering team would have needed to build such a model from scratch."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from wherobots.inference.plot.detections import show_detections\n",
+ "\n",
+ "show_detections(\n",
+ " unpacked_preds_df,\n",
+ " confidence_threshold=0.5,\n",
+ " plot_geoms=True,\n",
+ " side_by_side=False,\n",
+ " geometry_column=\"bboxes_wkt\",\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Next Steps with Raster Inference\n",
+ "\n",
+ "With access to general-purpose, text-promptable models, what will you predict and georeference next?\n",
+ "\n",
+ "Some ideas on next steps to try, include:\n",
+ "\n",
+ "* Predicting different objects next to the airplanes in the image tiles above using new text prompts.\n",
+ "* Adjusting the confidence score threshold for `RS_Text_to_Segments` or `RS_Text_to_BBoxes` to see how SAM2 or OWLv2 respond.\n",
+ "* Loading a new imagery dataset with our [STAC Reader](https://docs.wherobots.com/latest-snapshot/references/wherobotsdb/vector-data/Stac/) and try to predict a different feature of interest, such as agriculture, buildings, or tree crowns.\n",
+ "\n",
+ "We're excited to hear about what you're doing with SAM2 and OWLv2! "
+ ]
+ },
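+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a starting point for the first idea, here is a minimal sketch that re-runs text-prompted detection with a different prompt; \"helicopters\" is purely illustrative."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Illustrative sketch: re-run detection with a new, hypothetical prompt.\n",
+ "new_prompt = \"helicopters\"\n",
+ "new_preds = sedona.sql(f\"\"\"\n",
+ "    SELECT rast, RS_TEXT_TO_BBOXES('owlv2', rast, '{new_prompt}', 0.5) AS preds\n",
+ "    FROM df\n",
+ "\"\"\")\n",
+ "new_preds.show(5)"
+ ]
+ }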
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/Analyzing_Data/Raster_Text_To_Segments_Airplanes.ipynb b/Analyzing_Data/Raster_Text_To_Segments_Airplanes.ipynb
deleted file mode 100755
index 4b61359..0000000
--- a/Analyzing_Data/Raster_Text_To_Segments_Airplanes.ipynb
+++ /dev/null
@@ -1,407 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "\n",
- "# WherobotsAI Raster Inference - Predicting Bounding Boxes and Segments of Airplanes from an LLM-based Text Query\n",
- "\n",
- "\n",
- "#### Learn how to perform text-based object detection on aerial imagery using WherobotsAI's Raster Inference with **Segment Anything Model 2 (SAM2)** and **Google Deepmind's Open Vocabulary Object Detection (OWLv2)**.\n",
- "**Note:** Running Raster Inference notebooks requires a [GPU-Optimized](https://docs.wherobots.com/latest/develop/runtimes/#runtime-types) runtime and a Professional or Enterprise Edition Organization. If you haven't already done so, file a [**compute request**](https://docs.wherobots.com/latest/develop/runtimes/#compute-requests) to obtain a GPU-Optimized runtime.\n",
- "\n",
- "#### Follow along as we use the simple term, \\\"airplanes\\\", to identify, segment, and draw bounding boxes around **commercial aircraft** in satellite imagery of Miami airport."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Start WherobotsDB"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from sedona.spark import *\n",
- "from pyspark.sql.functions import expr\n",
- "\n",
- "config = (\n",
- " SedonaContext.builder()\n",
- " .getOrCreate()\n",
- ")\n",
- "\n",
- "sedona = SedonaContext.create(config)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Load Aerial Imagery Efficiently\n",
- "\n",
- "In this step, we'll load the aerial imagery so we can run **inference** in a later step.\n",
- "\n",
- "The GeoTIFF image is large, so we'll split it into tiles and load those tiles as **out-of-database** or **\\\"out-db\\\" rasters** in **WherobotsDB**."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "url = \"s3://wherobots-examples/data/naip/miami-airport.tiff\"\n",
- "tile_size = 256\n",
- "df = sedona.read.format(\"raster\").option(\"tileWidth\", tile_size).option(\"tileHeight\", tile_size).load(url)\n",
- "df.createOrReplaceTempView(\"df\")\n",
- "df.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Viewing the Model's Imagery Inputs\n",
- "\n",
- "We can see the footprints of the tiled images with the `SedonaKepler.create_map()` integration. Using `SedonaUtils.display_image()` we can view the images as well.\n",
- "\n",
- " **Tip:** Save the map to a html file using `kepler_map.save_to_html()`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "kepler_map = SedonaKepler.create_map()\n",
- "df = df.withColumn('footprint', expr(\"ST_TRANSFORM(RS_CONVEXHULL(rast),'EPSG:4326')\"))\n",
- "SedonaKepler.add_df(kepler_map, df=df, name=\"Image Footprints\")\n",
- "\n",
- "kepler_map"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "htmlDf = sedona.sql(f\"\"\"SELECT RS_AsImage(rast, 250) as FROM df limit 5\"\"\")\n",
- "SedonaUtils.display_image(htmlDf)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Run Inference and Visualize Results\n",
- "\n",
- "To run inference, specify the model to use with `model id`. Five models are pre-loaded and made available in **Wherobots Cloud** to Professional and Enterprise customers. You can also load your own models, learn more about that process [here](https://docs.wherobots.com/latest/tutorials/wherobotsai/wherobots-inference/raster-inference-overview/#bring-your-own-model-guide).\n",
- "\n",
- "Inference can be run using **Wherobots' Spatial SQL functions**, in this case: `RS_Text_to_Segments()`.\n",
- " \n",
- "Here, we generate predictions for all images in the Region of Interest (ROI). In the output, a label value of 1 signifies a positive prediction corresponding to the input text prompt.\n",
- "\n",
- "Then, we'll filter and print some of the results to see how our positive detection results look."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "model_id = \"sam2\"\n",
- "prompt = \"airplanes\"\n",
- "threshold = 0.5\n",
- "\n",
- "preds = sedona.sql(\n",
- " f\"\"\"SELECT rast, RS_TEXT_TO_SEGMENTS('{model_id}', rast, '{prompt}', {threshold}) AS preds from df\"\"\"\n",
- ")\n",
- "preds.cache().count()\n",
- "preds.createOrReplaceTempView(\"preds\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Prepare Results\n",
- "\n",
- "Before plotting our predictions, we need to transform our results.\n",
- "\n",
- "We'll need to transform our table so that each raster scene _only_ corresponds to a single predicted bounding box instead of every bounding box prediction.\n",
- "\n",
- "**Bounding boxes (or Bboxes)** are essentially boundaries drawn around an object of interest. \n",
- "\n",
- "To do this, combine the list columns containing our prediction results (`max_confidence_bboxes`, `max_confidence_scores`, and `max_confidence_labels`) with `arrays_zip`. Then, use `explode` to convert lists to rows.\n",
- "\n",
- "To map the results with `SedonaKepler`, convert the `max_confidence_bboxes` column to a `GeometryType` column with `ST_GeomFromWKT`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "preds_filtered = sedona.sql(f\"\"\"\n",
- " SELECT *\n",
- " FROM preds\n",
- " WHERE\n",
- " size(preds.labels) > 0\n",
- " AND array_contains(preds.labels, 1)\n",
- " AND NOT array_contains(preds.segments_wkt, 'POLYGON EMPTY')\n",
- "\"\"\")\n",
- "preds_filtered.createOrReplaceTempView(\"preds_filtered\")\n",
- "preds_filtered.show()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "exploded = sedona.sql(\"\"\"\n",
- "SELECT\n",
- " rast,\n",
- " exploded_predictions.*\n",
- "FROM\n",
- " preds_filtered\n",
- "LATERAL VIEW explode(arrays_zip(preds.segments_wkt, preds.confidence_scores, preds.labels)) AS exploded_predictions\n",
- "WHERE\n",
- " exploded_predictions.confidence_scores != 0.0\n",
- "\"\"\")\n",
- "exploded.cache().count()\n",
- "exploded.createOrReplaceTempView(\"exploded\")\n",
- "exploded.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Viewing Model Results: Airplane Segmentation Predictions\n",
- "\n",
- "Just like we visualized the footprints of the tiled images earlier, we can also view our prediction geometries! Highlight a prediction to view its confidence score."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "kepler_map = SedonaKepler.create_map()\n",
- "SedonaKepler.add_df(kepler_map, df=exploded, name=\"Airplane Detections\")\n",
- "\n",
- "kepler_map"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To view results on the underlying imagery used by the model, you can use the `show_detections` function. This function accepts a Dataframe containing an `outdb_raster` column as well as other arguments to control the plot result. Check out the full docs for the function by calling `show_detections?`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from wherobots.inference.plot.detections import show_detections\n",
- "show_detections?"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "unpacked_preds_df = sedona.sql(\"SELECT rast, preds.* FROM preds_filtered\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "show_detections(\n",
- " unpacked_preds_df,\n",
- " confidence_threshold=0.7,\n",
- " plot_geoms=True,\n",
- " geometry_column=\"segments_wkt\",\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Running Object Detection with a Text Prompt\n",
- "\n",
- "We can also get bounding box predictions instead of segments using `RS_Text_To_BBoxes`. BBoxes, or bounding boxes, are more useful when you are only concerned with counting and localizing objects rather than delineating exact shape and area with `RS_Text_To_Segments`.\n",
- "\n",
- "The inference process is largely the same for `RS_Text_To_BBoxes` and `RS_Text_To_Segments`.\n",
- "\n",
- "There are 2 key differences:\n",
- "* Using the `owlv2` `model_id` instead of `sam2`.\n",
- "* Changing our SQL queries to operate on the `bboxes_wkt` column instead of the `segments_wkt` column when working with prediction results."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "model_id = \"owlv2\"\n",
- "prompt = \"airplanes\"\n",
- "threshold = 0.5\n",
- "\n",
- "preds = sedona.sql(\n",
- " f\"\"\"SELECT rast, RS_TEXT_TO_BBoxes('{model_id}', rast, '{prompt}', {threshold}) AS preds from df\"\"\"\n",
- ")\n",
- "preds.cache().count()\n",
- "preds.createOrReplaceTempView(\"preds\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Just like before, we'll filter predictions by labels, remove empty predictions, and show the results in a browsable map and on top of the original imagery for comparison."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "preds_filtered = sedona.sql(f\"\"\"\n",
- " SELECT *\n",
- " FROM preds\n",
- " WHERE\n",
- " size(preds.labels) > 0\n",
- " AND array_contains(preds.labels, 1)\n",
- " AND NOT array_contains(preds.bboxes_wkt, 'POLYGON EMPTY')\n",
- "\"\"\")\n",
- "preds_filtered.createOrReplaceTempView(\"preds_filtered\")\n",
- "preds_filtered.show()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "exploded = sedona.sql(\"\"\"\n",
- "SELECT\n",
- " rast,\n",
- " exploded_predictions.*\n",
- "FROM\n",
- " preds_filtered\n",
- "LATERAL VIEW explode(arrays_zip(preds.bboxes_wkt, preds.confidence_scores, preds.labels)) AS exploded_predictions\n",
- "WHERE\n",
- " exploded_predictions.confidence_scores != 0.0\n",
- "\"\"\")\n",
- "exploded.cache().count()\n",
- "exploded.createOrReplaceTempView(\"exploded\")\n",
- "exploded.show()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "kepler_map = SedonaKepler.create_map()\n",
- "SedonaKepler.add_df(kepler_map, df=exploded, name=\"Airplane Detections\")\n",
- "\n",
- "kepler_map"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "unpacked_preds_df = sedona.sql(\"SELECT rast, preds.* FROM preds_filtered\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We see below that OWLv2 and SAM2 do remarkably well at identifying airplanes with little user effort! Previously, achieving similar results was a significant undertaking. An entire Machine Learning engineering team would have needed to build such a model from scratch."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "show_detections(\n",
- " unpacked_preds_df,\n",
- " confidence_threshold=0.5,\n",
- " plot_geoms=True,\n",
- " side_by_side=False,\n",
- " geometry_column=\"bboxes_wkt\",\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Next Steps with Raster Inference\n",
- "\n",
- "With access to general-purpose, text-promptable models, what will you predict and georeference next?\n",
- "\n",
- "Some ideas on next steps to try, include:\n",
- "\n",
- "* Predicting different objects next to the airplanes in the image tiles above using new text prompts.\n",
- "* Adjusting the confidence score threshold for `RS_Text_to_Segments` or `RS_Text_to_BBoxes` to see how SAM2 or OWLv2 respond.\n",
- "* Loading a new imagery dataset with our [STAC Reader](https://docs.wherobots.com/latest-snapshot/references/wherobotsdb/vector-data/Stac/) and try to predict a different feature of interest, such as agriculture, buildings, or tree crowns.\n",
- "\n",
- "We're excited to hear about what you're doing with SAM2 and OWLv2! "
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
diff --git a/README.md b/README.md
index 13bea67..43664d2 100644
--- a/README.md
+++ b/README.md
@@ -31,7 +31,7 @@ will pass as the pre-commit hooks will fix the issues it finds.
| |-- Object_Detection.ipynb
| |-- Raster_Classification.ipynb
| |-- Raster_Segmentation.ipynb
-| |-- Raster_Text_To_Segments_Airplanes.ipynb
+| |-- Raster_Text_To_Segments.ipynb
| `-- Zonal_Stats_ESAWorldCover_Texas.ipynb
|-- CONTRIBUTING.md
|-- Getting_Started