BinSight is an APK analysis tool that dissects native libraries (`.so` files) and uses Large Language Models (LLMs), augmented by live web search results, to determine each library's purpose, developer, and potential security implications. It automates much of the reverse-engineering work involved in binary analysis, making it faster, more accurate, and more accessible.
- APK Extraction: Automatically extracts all `.so` native libraries from a given Android APK.
- Live Web Search: Performs a web search for each library to gather real-time, public information about its developer and purpose.
- Multi-Arch Disassembly: Uses `pyelftools` and `capstone` to disassemble code for ARM, ARM64, x86, and x86-64 architectures (a short sketch of this step follows the list).
- Rich Data Extraction: Pulls not just assembly code, but also function names and embedded strings for a more context-rich analysis.
- Flexible LLM Integration: Powered by `litellm`, it supports over 100 LLMs from various providers (OpenAI, Google, Anthropic, Cohere, and any OpenAI-compatible API).
- Configurable & Easy to Use: A simple command-line interface lets you specify the target APK, choose your LLM, and configure custom API endpoints and keys.
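Conceptually, the multi-arch disassembly step maps the ELF machine type to a matching Capstone configuration and walks the `.text` section. The snippet below is a minimal sketch of that idea under assumptions (the function name and architecture map are illustrative, not BinSight's actual code):

```python
# Minimal sketch (not BinSight's actual code): pick a Capstone mode from the
# ELF machine type and disassemble the .text section.
from capstone import (Cs, CS_ARCH_ARM, CS_ARCH_ARM64, CS_ARCH_X86,
                      CS_MODE_ARM, CS_MODE_32, CS_MODE_64)
from elftools.elf.elffile import ELFFile

ARCH_MAP = {
    'EM_ARM':     (CS_ARCH_ARM,   CS_MODE_ARM),
    'EM_AARCH64': (CS_ARCH_ARM64, CS_MODE_ARM),
    'EM_386':     (CS_ARCH_X86,   CS_MODE_32),
    'EM_X86_64':  (CS_ARCH_X86,   CS_MODE_64),
}

def disassemble_text_section(so_path):
    """Yield human-readable instructions from the library's .text section."""
    with open(so_path, 'rb') as f:
        elf = ELFFile(f)
        arch, mode = ARCH_MAP[elf['e_machine']]
        text = elf.get_section_by_name('.text')
        md = Cs(arch, mode)
        for insn in md.disasm(text.data(), text['sh_addr']):
            yield f"0x{insn.address:x}: {insn.mnemonic} {insn.op_str}"
```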
- Clone the repository (optional): If you already have the project files, you can skip this step.

  ```bash
  git clone <repository_url>
  cd binsight-project
  ```

- Install dependencies: Ensure you have Python 3.6+ installed, then install the required packages from `requirements.txt`:

  ```bash
  pip install -r requirements.txt
  ```
The script is run from the command line with several options to customize its behavior.
```bash
python binsight.py <target_path> [options]
```
- `target_path`: (Required) The path to a single `.apk` file or a directory containing multiple `.apk` files.
- `--model`: The LLM model to use, in `litellm` format (e.g., `gemini/gemini-1.5-flash`).
- `--api_key`: Your API key for the chosen provider. If not set, the tool looks for a corresponding environment variable (e.g., `GOOGLE_API_KEY`, `OPENAI_API_KEY`).
- `--api_base`: The API base URL for non-standard providers such as SiliconFlow or a self-hosted model (a minimal argument-parsing sketch follows this list).
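For orientation, the options above could be wired up with `argparse` roughly as follows. This is a sketch of the CLI surface only, not BinSight's actual parser, and the help strings are assumptions:

```python
# Sketch of the command-line options described above; not BinSight's actual parser.
import argparse

parser = argparse.ArgumentParser(description="Analyze native libraries in an APK with an LLM.")
parser.add_argument("target_path",
                    help="Path to a single .apk file or a directory of .apk files")
parser.add_argument("--model",
                    help="LLM model in litellm format, e.g. gemini/gemini-1.5-flash")
parser.add_argument("--api_key", default=None,
                    help="API key; if omitted, the provider's environment variable is used")
parser.add_argument("--api_base", default=None,
                    help="API base URL for OpenAI-compatible or self-hosted providers")
args = parser.parse_args()
```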
This is the simplest use case. It assumes you have your Google API key set in the environment.
- Set the environment variable:

  ```bash
  export GOOGLE_API_KEY="your_google_api_key"
  ```

- Run the analysis:

  ```bash
  python binsight.py /path/to/your_app.apk --model "gemini/gemini-1.5-flash"
  ```
This example shows how to use an OpenAI-compatible endpoint, such as SiliconFlow. Based on the official LiteLLM documentation, you must prefix the model name with `openai/` to route the request correctly.

```bash
python binsight.py /path/to/your_app.apk \
  --model "openai/Qwen/Qwen3-235B-A22B" \
  --api_base "https://api.siliconflow.cn/v1" \
  --api_key "your_siliconflow_api_key"
```

Note: The `openai/` prefix is required for `litellm` to use its standard OpenAI-compatible client.
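For context, this is what the same request looks like when made directly through `litellm`; the model name, endpoint, and key are the placeholders from the example above, and the `openai/` prefix is what selects the OpenAI-compatible client:

```python
# Sketch: calling an OpenAI-compatible endpoint (e.g. SiliconFlow) through litellm.
# The "openai/" prefix routes the request via litellm's OpenAI-compatible client.
import litellm

response = litellm.completion(
    model="openai/Qwen/Qwen3-235B-A22B",
    api_base="https://api.siliconflow.cn/v1",
    api_key="your_siliconflow_api_key",
    messages=[{"role": "user", "content": "What is libflutter.so used for?"}],
)
print(response.choices[0].message.content)
```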
- Extract: The input APK is unzipped, and all `.so` files are copied to a temporary location (a minimal extraction sketch follows this list).
- Web Search: For each `.so` file, BinSight performs a web search to find its likely purpose and developer.
- Disassemble: The tool identifies the library's architecture, locates the executable `.text` section, and disassembles the machine code into human-readable assembly instructions.
- Analyze: A detailed prompt containing the web search results, filename, assembly code, function names, and strings is sent to the configured LLM via `litellm`.
- Report: The LLM's conclusion about each library's purpose and developer is collected and printed in a final summary report.
- Clean Up: All temporary files are deleted.
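Since an APK is just a ZIP archive, the extraction step can be pictured with the standard library alone. The helper name and temporary-directory handling below are illustrative assumptions, not BinSight's actual code:

```python
# Rough sketch of the extraction step: copy every .so entry out of the APK
# (a ZIP archive) into a temporary directory. Helper name is illustrative.
import tempfile
import zipfile

def extract_native_libs(apk_path):
    out_dir = tempfile.mkdtemp(prefix="binsight_")
    extracted = []
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            if name.endswith(".so"):
                # Preserves the lib/<abi>/ layout, e.g. lib/arm64-v8a/libfoo.so
                extracted.append(apk.extract(name, path=out_dir))
    return out_dir, extracted
```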
This tool uses `litellm` to interact with language models, which means you can use any of the 100+ models supported by `litellm`.
- To find the correct model identifier string, please refer to the official LiteLLM Provider List.
- Universal LLM Interface: BinSight uses `litellm` as a unified gateway to over 100 LLM providers. This removes the need for provider-specific code and allows for seamless integration of new models.
- Dynamic Analysis Pipeline:
  - APK Deconstruction: Extracts all unique `.so` libraries from the target APK.
  - Metadata Extraction: Uses `pyelftools` and `Capstone` to get assembly code, function names, and embedded strings from each library (a metadata-extraction sketch follows this list).
  - Intelligent Analysis via LLM: Sends this rich metadata package to the user-selected LLM. The prompt directs the model to act as a security expert, first identifying the library by name using its internal knowledge, then corroborating that with the provided binary evidence.
- Unified Results: It presents a clear, concise summary of the likely purpose for each analyzed library.
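As a loose illustration of the metadata-extraction step, exported function names can be read from the ELF symbol tables and printable strings pulled from `.rodata` using `pyelftools`. This is a sketch under assumptions (the function name and the minimum string length are illustrative), not BinSight's actual implementation:

```python
# Sketch: collect exported function names and embedded strings from a .so file.
import re
from elftools.elf.elffile import ELFFile
from elftools.elf.sections import SymbolTableSection

def extract_metadata(so_path, min_str_len=6):
    funcs, strings = [], []
    with open(so_path, 'rb') as f:
        elf = ELFFile(f)
        # Function names come from the symbol tables (.dynsym / .symtab).
        for section in elf.iter_sections():
            if isinstance(section, SymbolTableSection):
                funcs += [sym.name for sym in section.iter_symbols()
                          if sym.name and sym['st_info']['type'] == 'STT_FUNC']
        # Embedded strings: printable ASCII runs in .rodata.
        rodata = elf.get_section_by_name('.rodata')
        if rodata:
            raw = re.findall(rb'[\x20-\x7e]{%d,}' % min_str_len, rodata.data())
            strings = [s.decode('ascii') for s in raw]
    return funcs, strings
```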
- Clone the repository.
- Install Dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set API Keys (Environment Variables): `litellm` automatically finds API keys set as environment variables. Set the key for the provider you intend to use.

  ```bash
  # For OpenAI models (gpt-4o, gpt-4-turbo, etc.)
  export OPENAI_API_KEY="YOUR_OPENAI_KEY"

  # For Google models (gemini/gemini-1.5-pro, etc.)
  export GEMINI_API_KEY="YOUR_GEMINI_KEY"

  # For SiliconFlow models
  export SILICONFLOW_API_KEY="YOUR_SILICONFLOW_KEY"
  ```
Run BinSight against a single APK file or an entire directory. The `--model` argument is now the central piece of the command.

You specify the model using the format recognized by `litellm`. Here are some common examples:

- Gemini: `gemini/gemini-2.5-pro`
- SiliconFlow: `openai/Qwen/Qwen3-32B`
```bash
# Analyze with OpenAI's GPT-4o (requires OPENAI_API_KEY)
python binsight.py /path/to/app.apk --model gpt-4o --api_key "YOUR_KEY"

# Analyze with Google's Gemini 2.5 Flash (requires GEMINI_API_KEY)
python binsight.py /path/to/app.apk --model gemini/gemini-2.5-flash --api_key "YOUR_KEY"

# Analyze with SiliconFlow's Qwen/Qwen3-32B, providing the key directly
python binsight.py /path/to/app.apk --model "openai/Qwen/Qwen3-32B" --api_base "https://api.siliconflow.cn/v1" --api_key "YOUR_KEY"
```
- `input_path`: Required. Path to an APK file or a directory of APKs.
- `--model`: Required. The model identifier for `litellm`.
- `--api_key`: Optional. Provide the API key directly; overrides environment variables (a key-resolution sketch follows this list).
- `--api_base`: Optional. The API base URL for custom providers (e.g., SiliconFlow, local models).
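The precedence rule for `--api_key` (an explicit flag wins, otherwise fall back to the provider's environment variable) could look roughly like this. The helper name and the provider-to-variable map are assumptions for illustration, not BinSight's actual code:

```python
# Sketch of the --api_key precedence: explicit flag first, environment variable second.
import os

ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

def resolve_api_key(cli_api_key, model):
    if cli_api_key:                      # --api_key always wins
        return cli_api_key
    provider = model.split("/", 1)[0]    # "gemini/gemini-2.5-flash" -> "gemini"
    env_var = ENV_KEYS.get(provider)
    return os.environ.get(env_var) if env_var else None
```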
Here is a sample output from analyzing an APK using `openai/Qwen/Qwen3-32B`:

```
$ python binsight.py /path/to/some.apk --model openai/Qwen/Qwen3-32B --api_base "https://api.siliconflow.cn/v1" --api_key "sk-xxx"
==================================================
Starting analysis for: some.apk
==================================================
--- Comprehensive Analysis (Disassembly + LLM) ---
[*] Processing: libflutter.so (from lib/arm64-v8a/libflutter.so)
  -> Analyzing with litellm (model: openai/Qwen/Qwen3-32B, attempt: 1/3)...
  [+] LLM analysis result: Intent: Google Flutter UI Framework | Confidence: High | Evidence: Known library name confirmed by numerous 'flutter::' and 'dart::' function names and strings like 'Flutter Engine'.
------------------------------------
Final Analysis Summary for some.apk
------------------------------------
Analysis complete! Found 1 SDKs or code intents:
- Intent Analysis - libflutter.so
  Intent: Google Flutter UI Framework | Confidence: High | Evidence: Known library name confirmed by numerous 'flutter::' and 'dart::' function names and strings like 'Flutter Engine'.
```