Extracts structured location information from unstructured incident text using a local LLM, geocodes results, and visualizes them on a Folium map. Designed to keep sensitive text processing local while using pluggable geocoders (Google Maps, OSM/Nominatim, etc.).
This script automates the extraction and visualization of location information from unstructured text, particularly focusing on security incidents in conflict zones. It utilizes a local Large Language Model (LLM) to ensure privacy while extracting location details from sensitive text, which are then formatted for submission to a geocoding system.
Google Maps is employed in this implementation, but users can configure and use an in-house OpenStreetMap geocoding system or any other for enhanced privacy and autonomy. The resulting locations are plotted on an interactive map using Folium, facilitating comprehensive visual analysis. This workflow enables users to process sensitive data locally, addressing privacy concerns and providing detailed geographical insights for informed decision-making.
Additionally, the script leverages LMStudio (https://lmstudio.ai/) in local server mode, ensuring privacy, easy testing, and seamless switching between multiple models. Performance testing on a local machine with an Nvidia T1200 4GB GPU indicates good performance. Performance for the extraction can be improve by using a more powerful LLM localy or an online one like ChatGPT.
Extracts and structures location data from CSV files containing conflict-related security incidents. Utilizes a local Large Language Model (LLM) for privacy-preserving location extraction. Formats extracted data for submission to Google Maps' geocoding system. Visualizes finalized locations on an interactive map using Folium. Exported to a .xlsx file for further analysis.
Ensure LMStudio (https://lmstudio.ai/) is set up and runs a local server mode. The following models were tested for location extraction:
- Phi-3-mini-128k-instruct-GGUF-Imatrix-smashed (https://huggingface.co/PrunaAI/Phi-3-mini-128k-instruct-GGUF-Imatrix-smashed)
- OpenHermes-2.5-Mistral-7B-GGUF (https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF) Larger models can be used and will improve results, but will require a more powerful GPU.
You will need a Google Map API key to run the code. If not, you can set up your own instance of OSM or use Nominatim
Tested on a local machine with an Nvidia T1200 4GB GPU, the script demonstrates good performance with the specified models.
This project is licensed under the MIT License - see the LICENSE file for details.
- LMStudio for providing local LLM capabilities
- Hugging Face for hosting the language models
- Folium for mapping capabilities
- The open-source community for various geocoding solutions
- Issues: GitHub Issues
- Discussions: GitHub Discussions
⭐ Star this repository if you find it helpful!