Executing NLP code with Python (document classification)

₹600-1500 INR

Completed
Posted almost 3 years ago

Paid on delivery
For this problem, you will work extensively on text classification from chapter 6. You will build a Naive Bayes classifier from scratch to classify whether an email is spam or not. You will get hands-on experience building a machine learning classifier using NLTK. You can submit a Python notebook file for this homework. The answers can be submitted separately in a [login to view URL]. Don't hesitate to Google, but don't copy the code.

Read section 1 of NLTK chapter 6 and familiarize yourself with the document classification example for movie reviews.
Download the dataset from ACL Wiki: [login to view URL]. There are many spam datasets.
Untar the dataset. Google "untar" and find out how to deal with a non-zip type [login to view URL].
Read the readme file.
How many folders are there in the archive?
What is the difference between the different folders?
You will work with the Part 1 folder in the lemm_stop folder. Show the code snippet to get marks for this [login to view URL].
How many documents are marked as spam and not spam? How did you come up with the number?
How many words are there in all the documents?
What are the top 5 most frequent words in the spam documents?
What are the top 5 most frequent words in the non-spam documents?
What is the maximum number of words in a document?
What is the minimum number of words in a document?
Write a feature extractor function similar to document_features in the NLTK example. Don't copy the code from the NLTK book. Use the feature extractor function to create a training dataset on Part 1 of the data. Train a Naive Bayes classifier as shown in the book [login to view URL].
For testing, we will use Part 10 in the lemm_stop folder. Follow similar steps as above to create a test dataset. Apply the feature extractor function to extract features from the test dataset.
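The counting questions and the feature extractor step above can be sketched roughly as follows. This is a minimal illustration on made-up toy documents, not the expected homework answer. It assumes each document is already loaded as a (filename, tokens) pair and that spam files can be recognised by a filename prefix of "spmsg"; that prefix, the "ham" label, and the helper names label_for, corpus_stats, and make_feature_extractor are all hypothetical and should be checked against the dataset's readme.

```python
from collections import Counter


def label_for(filename):
    # Assumption: spam files have names starting with "spmsg".
    # Verify this against the readme before relying on it.
    return "spam" if filename.startswith("spmsg") else "ham"


def corpus_stats(docs):
    """Summarise a list of (filename, tokens) pairs."""
    label_counts = Counter(label_for(name) for name, _ in docs)
    lengths = [len(tokens) for _, tokens in docs]
    freq = {"spam": Counter(), "ham": Counter()}
    for name, tokens in docs:
        freq[label_for(name)].update(tokens)
    return {
        "label_counts": dict(label_counts),
        "total_words": sum(lengths),
        "max_words": max(lengths),
        "min_words": min(lengths),
        "top_spam": freq["spam"].most_common(5),
        "top_ham": freq["ham"].most_common(5),
    }


def make_feature_extractor(docs, vocab_size=2000):
    """Build a function mapping tokens to a dict of word-presence features."""
    all_words = Counter(t for _, tokens in docs for t in tokens)
    vocab = [w for w, _ in all_words.most_common(vocab_size)]

    def extract(tokens):
        present = set(tokens)
        return {f"has({w})": (w in present) for w in vocab}

    return extract


# Toy stand-in for the real folder contents (invented data).
toy_docs = [
    ("spmsga1.txt", "free money click free offer".split()),
    ("spmsga2.txt", "free offer now".split()),
    ("3-1msg1.txt", "syntax workshop call paper".split()),
]

stats = corpus_stats(toy_docs)
extract = make_feature_extractor(toy_docs, vocab_size=5)
# A list of (feature dict, label) pairs, one per document.
train_set = [(extract(tokens), label_for(name)) for name, tokens in toy_docs]
```

A list of (feature dict, label) pairs in this shape is what nltk.NaiveBayesClassifier.train accepts as training data. For the unpacking step, Python's standard tarfile module can open .tar.gz archives.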
What is its accuracy on the test dataset? Show the code. What happens if you test on the training dataset? If you get accuracies below 50%, then there is a bug in your [login to view URL]: What is the precision, recall, and F-score of the classifier that you trained? Read section 3 of the chapter to answer these [login to view URL]. Can you try another classifier such as logistic regression? What do the evaluation metrics look like? This is a good starting point to start using [login to view URL]. You can use scikit-learn's implementation: [login to view URL]. An example of working with text data is here: [login to view URL]
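The evaluation metrics asked about above can be computed directly from two parallel lists of labels. The sketch below is one possible way to do it, assuming "spam" is treated as the positive class; the function name evaluate and the toy label lists are invented for illustration.

```python
def evaluate(gold, predicted, positive="spam"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels: 5 test documents, 3 classified correctly.
gold = ["spam", "spam", "ham", "ham", "ham"]
predicted = ["spam", "ham", "ham", "ham", "spam"]
scores = evaluate(gold, predicted)
```

With a trained NLTK classifier, the predicted list would come from calling classifier.classify(features) on each test document's feature dict. Running the same function on the training documents typically gives a higher score than on held-out data, which is why the two numbers are worth comparing.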
Project ID: 29758917

About the project

2 proposals
Remote project
Active 3 years ago

Awarded to:
Hi, I am interested in this ML project for natural language processing. Thanks :)
₹1,000 INR in 7 days
5.0 (7 reviews)
3.1

About the client

Hyderabad, India
0.0
0
Payment method verified
Joined March 2, 2021

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)