| title: FactChecker | |
| emoji: π | |
| colorFrom: pink | |
| colorTo: red | |
| sdk: docker | |
| pinned: false | |
| license: mit | |
| short_description: 'FactChecker: Fake News Detector' | |
| # FactChecker: Fake News Detection Web Application | |
| FactChecker is a web application that detects fake news using various machine learning models. | |
| The system analyzes text input and predicts whether the content is likely to be real or fake news, | |
| providing confidence scores and visualizations to help users understand the results. | |
| ## Features | |
| - **Multiple ML Models**: Choose between three different models or use all of them together: | |
|   - Logistic Regression (Accuracy: 90.42%, F1 Score: 87.62%) | |
|   - Random Forest (Accuracy: 90.83%, F1 Score: 87.52%) | |
|   - DistilBERT (Accuracy: 91.00%, F1 Score: 88.45%) | |
| - **Ensemble Approach**: When selecting "All Models," the system combines predictions using a voting mechanism for more robust results | |
| - **Real-time Analysis**: Instantly assess the credibility of news articles or statements | |
| - **Confidence Scores**: View the model's level of certainty in its predictions | |
| - **Visual Interface**: Color-coded results (green for real, red for fake) for intuitive understanding | |
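
The "Ensemble Approach" feature above combines the three models' predictions by voting. The exact voting rule used in `app.py` is not documented here, so the following is only a minimal sketch of one plausible scheme: a plain majority vote with ties broken by summed confidence. The function name `majority_vote`, the `"real"`/`"fake"` label strings, and the example confidences are illustrative assumptions, not values taken from the application.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model predictions by majority vote.

    `predictions` maps a model name to a (label, confidence) pair, where
    label is "real" or "fake" and confidence lies in [0, 1] (assumed format).
    Returns the winning label and the mean confidence of the models that
    voted for it.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")

    votes = Counter(label for label, _ in predictions.values())

    def rank(label):
        # Rank by vote count first, then by summed confidence to break ties.
        total_conf = sum(c for l, c in predictions.values() if l == label)
        return (votes[label], total_conf)

    winner = max(votes, key=rank)
    supporting = [c for l, c in predictions.values() if l == winner]
    return winner, sum(supporting) / len(supporting)


# Hypothetical outputs from the three models for a single article.
example = {
    "logistic_regression": ("fake", 0.81),
    "random_forest": ("real", 0.55),
    "distilbert": ("fake", 0.93),
}
label, confidence = majority_vote(example)
print(label, round(confidence, 2))
```

With three binary classifiers a tie cannot occur, so the tie-break only matters if a model is unavailable and fewer predictions are passed in.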
| ## Technology Stack | |
| ### Backend | |
| - Python 3.11 with Flask 2.0.1 | |
| - NLTK 3.9.1 for natural language processing | |
| - Scikit-learn 1.6.1 for traditional machine learning models | |
| - PyTorch 2.6.0 and Transformers 4.49.0 for the DistilBERT model | |
| - Gunicorn 20.1.0 for production deployment | |
| **Verify that the installed package versions match the ones listed above before running the backend.** | |
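
One way to check the versions is to compare what is installed against the list above. The sketch below uses the standard-library `importlib.metadata` module; the `EXPECTED` table is copied from the Backend list, and the keys are assumed to be the matching PyPI distribution names. The helper name `find_mismatches` is hypothetical.

```python
from importlib import metadata

# Versions copied from the Backend list; keys are assumed PyPI distribution names.
EXPECTED = {
    "Flask": "2.0.1",
    "nltk": "3.9.1",
    "scikit-learn": "1.6.1",
    "torch": "2.6.0",
    "transformers": "4.49.0",
    "gunicorn": "20.1.0",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, installed)} for each package whose
    installed version differs; installed is None if it is not installed."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu124" on PyTorch wheels.
        base = installed.split("+", 1)[0] if installed else None
        if base != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

Running the script inside the activated virtual environment prints one line per mismatch and nothing if every package matches.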
| ### Frontend | |
| - React.js for the user interface | |
| - Modern JavaScript (ES6+) | |
| - CSS for styling | |
| ### Data Processing | |
| - Pandas and NumPy for data manipulation | |
| - TF-IDF Vectorization for feature extraction | |
| - Regular expressions for text cleaning | |
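
To illustrate the regular-expression cleaning step listed above, here is a minimal sketch. The actual cleaning rules live in the training notebook and are not reproduced in this README, so the specific patterns (URL removal, dropping non-letter characters, collapsing whitespace) and the name `clean_text` are assumptions.

```python
import re

# Assumed cleaning rules; the notebook may use different ones.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text):
    """Lowercase the text, remove URLs and non-letter characters,
    and collapse runs of whitespace into single spaces."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


print(clean_text("BREAKING: Visit https://example.com NOW!!! 100% true..."))
```

The cleaned string would then be passed to a TF-IDF vectorizer (for example scikit-learn's `TfidfVectorizer`) to produce the feature vectors used by the traditional models.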
| ## Project Structure | |
| ``` | |
| FactChecker/ | |
| ├── build/ # React build files (compiled frontend) | |
| │   ├── static/ | |
| │   │   ├── css/ # Compiled CSS | |
| │   │   └── js/ # Compiled JavaScript | |
| │   ├── asset-manifest.json | |
| │   ├── index.html # Main HTML file | |
| │   ├── logo.ico | |
| │   ├── logo.png | |
| │   └── manifest.json | |
| ├── model_training/ # Model training materials | |
| │   ├── visualizations/ # Generated visualization images | |
| │   └── model_training.ipynb # Jupyter notebook for model training | |
| ├── models/ # Saved ML models | |
| │   ├── tfidf_vectorizer.pkl # TF-IDF vectorizer | |
| │   ├── lr_model.pkl # Logistic Regression model | |
| │   ├── rf_model.pkl # Random Forest model | |
| │   └── distilbert_model.pt # DistilBERT model | |
| ├── .gitattributes | |
| ├── Dockerfile # Docker configuration | |
| ├── README.md | |
| ├── app.py # Flask application | |
| └── requirements.txt # Python dependencies | |
| ``` | |
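
Since the backend depends on the saved files under `models/`, it can help to confirm they are present before starting the app. The following is a small sketch using only `pathlib`; the file names come from the tree above, while the helper name `missing_model_files` is hypothetical and not part of the project.

```python
from pathlib import Path

# File names taken from the models/ directory in the project tree.
MODEL_FILES = [
    "tfidf_vectorizer.pkl",
    "lr_model.pkl",
    "rf_model.pkl",
    "distilbert_model.pt",
]


def missing_model_files(project_root):
    """Return the expected model files that are absent from <root>/models."""
    models_dir = Path(project_root) / "models"
    return [name for name in MODEL_FILES if not (models_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files(".")
    print("missing:", missing if missing else "none")
```

Run it from the repository root; an empty result means all four artifacts were found.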
| ## Steps | |
| ### For Backend: | |
| 1. Clone the repository | |
| 2. Create a virtual environment and install the dependencies. | |
| ```pip install -r requirements.txt``` | |
| 3. Download NLTK resources: | |
| ```python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')"``` | |
| 4. Run the application | |
| ```python app.py``` | |
| ### For Frontend: | |
| 1. Install dependencies: | |
| ```npm install``` | |
| 2. Build the frontend: | |
| ```npm run build``` | |
| ### Model Training | |
| To retrain the models: | |
| 1. Upload the notebook to Google Colab. | |
| 2. Download the ISOT dataset files (true.csv, fake.csv) and upload them to Google Drive. | |
| 3. Set runtime type to GPU for optimal performance: | |
| ```Go to Runtime → Change runtime type → GPU → Save``` | |
| 4. Activate the runtime. | |
| 5. Execute the notebook cells sequentially to retrain the models. | |