Advanced neural network based on https://github.com/RusZ/TextClassifier
TODO
- Mention the DDD paradigm in the description
- Move network training to the Apache Ignite platform or S3 cloud
Application for text classification using neural networks.
- Java SE Development Kit 8 (`jdk-1.8`)
- Encog Machine Learning Framework (`org.encog:encog-core:3.3.0`)
- Apache POI (`org.apache.poi:poi-ooxml:3.16`)
- SQLiteJDBC (`org.xerial:sqlite-jdbc:3.19.3`)
- JUnit (`junit:junit:4.12`)
- H2 Database Engine (`com.h2database:h2:1.4.196`)
- Mockito (`org.mockito:mockito-core:2.8.47`)
- Hibernate ORM (`org.hibernate:hibernate-core:5.2.10.Final`, `org.hibernate:hibernate-entitymanager:5.2.10.Final`)
- SLF4J (`org.slf4j:slf4j-log4j12:1.7.25`)
- Javassist (`org.javassist:javassist:3.22.0-CR2`)
Parameter | Description | Possible values |
---|---|---|
db_path | Path for database files and trained classifiers | Example: ./db |
dao_type | Method of data storage and access | jdbc, hibernate |
dbms_type | Database management system | sqlite, h2 |
db_filename | Database name | Example: TextClassifier |
ngram_strategy | Text splitting algorithm | unigram, filtered_unigram, bigram, filtered_bigram |
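
The parameters in the table can be checked before the application starts. The sketch below is only an illustration: it assumes the settings are stored as plain `key=value` text, and the class `ConfigSketch` and its methods are hypothetical names, not part of the project. The allowed values are copied from the table.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not the project's own loader.
final class ConfigSketch {
    // Allowed values, copied from the parameter table.
    static final List<String> DAO_TYPES = Arrays.asList("jdbc", "hibernate");
    static final List<String> DBMS_TYPES = Arrays.asList("sqlite", "h2");
    static final List<String> NGRAM_STRATEGIES =
            Arrays.asList("unigram", "filtered_unigram", "bigram", "filtered_bigram");

    // Parses key=value text (an assumed format) and rejects missing or unsupported values.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        requireNonEmpty(props, "db_path");
        requireNonEmpty(props, "db_filename");
        requireOneOf(props, "dao_type", DAO_TYPES);
        requireOneOf(props, "dbms_type", DBMS_TYPES);
        requireOneOf(props, "ngram_strategy", NGRAM_STRATEGIES);
        return props;
    }

    static void requireNonEmpty(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(key + " must not be empty");
        }
    }

    static void requireOneOf(Properties props, String key, List<String> allowed) {
        String value = props.getProperty(key);
        if (value == null || !allowed.contains(value.trim())) {
            throw new IllegalArgumentException(
                    key + " must be one of " + allowed + ", got: " + value);
        }
    }

    public static void main(String[] args) {
        // Sample values taken from the "Possible values" column.
        String sample = "db_path=./db\n"
                + "dao_type=jdbc\n"
                + "dbms_type=sqlite\n"
                + "db_filename=TextClassifier\n"
                + "ngram_strategy=filtered_unigram\n";
        Properties props = load(sample);
        System.out.println("Storage: " + props.getProperty("dao_type")
                + " on " + props.getProperty("dbms_type"));
    }
}
```

Failing fast on an unsupported `dao_type` or `ngram_strategy` keeps a typo in the settings from surfacing later as a database or training error.
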
- On first launch, the application asks for an XLSX file with training data. The file can contain one or two sheets: the first sheet holds the training data, and the optional second sheet holds data for testing accuracy. File structure:
- The application then builds the vocabulary and creates and trains a neural network for each Characteristic.
- Restart the application and use it for text classification.
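
The vocabulary step can be pictured with a minimal sketch of the `unigram` and `bigram` strategies. Everything here is an assumption made for illustration: the class `NGramSketch`, the tokenization rule (lowercase, split on anything that is not a letter or digit), and the 0/1 input encoding are hypothetical and may differ from what the application does. The `filtered_*` variants are not shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration, not the project's own classes.
final class NGramSketch {
    // Splits text into lowercase words on any run of non-letter, non-digit characters.
    static List<String> unigrams(String text) {
        List<String> words = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Joins each pair of adjacent words into a single token.
    static List<String> bigrams(String text) {
        List<String> words = unigrams(text);
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            pairs.add(words.get(i) + " " + words.get(i + 1));
        }
        return pairs;
    }

    // Collects the distinct tokens of all texts in first-seen order.
    static List<String> vocabulary(List<List<String>> tokenizedTexts) {
        Set<String> seen = new LinkedHashSet<>();
        for (List<String> tokens : tokenizedTexts) {
            seen.addAll(tokens);
        }
        return new ArrayList<>(seen);
    }

    // Encodes one text as a 0/1 vector: 1.0 where the vocabulary entry occurs in the text.
    static double[] encode(List<String> tokens, List<String> vocabulary) {
        double[] vector = new double[vocabulary.size()];
        for (int i = 0; i < vocabulary.size(); i++) {
            if (tokens.contains(vocabulary.get(i))) {
                vector[i] = 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        List<List<String>> training = new ArrayList<>();
        training.add(unigrams("Printer does not print"));
        training.add(unigrams("Cannot print from laptop"));
        List<String> vocab = vocabulary(training);
        System.out.println(vocab);
        System.out.println(bigrams("Printer does not print"));
    }
}
```

With an encoding like this, the input layer of each network would have one neuron per vocabulary entry, so the choice of `ngram_strategy` directly affects the network size.
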
This project is licensed under the MIT License - see the LICENSE file for details.