Check if words sound alike using Natural

Hannah Davis

InstructorHannah Davis

Natural^0.4.0

Share this video with your friends

In this lesson, we’ll take a look at Natural’s phonetics feature. We’ll learn how to check whether two words sound alike, looking at both the SoundEx and Metaphone algorithms.

illustration for Natural Language Processing in JavaScript with Natural

Course

Natural Language Processing in JavaScript with Natural

Yonatan Shalev

Yonatan Shalev

~ 4 years ago

Should I split documents into single sentences or use them as is to train Brain.js text classification model? I was wondering what's the best way to feed the model with training data.

Can i just use the document as is? like this: {"phrase": "First long document with up to 30 sentences", "result": {"label 1": 1}}, {"phrase": "first long document with up to 30 sentences", "result": {"label 2": 1}} {"phrase": "Second long document with up to 30 sentences", "result": {"label 2": 1}}, etc. Or, should I split all documents into sentences and then the data will look like something this: {"phrase": "Sentence 1 out of document 1", "result": {"label 1": 1}}, {"phrase": "Sentence 2 out of document 1", "result": {"label 2": 1}}, etc.

{"phrase": "Sentence 1 out of document 2", "result": {"label 5": 1}}, etc.

{"phrase": "Sentence X out of document X", "result": {"No labels at all": 1}}, etc. Same question about using the model, should I just apply it on the complete document or should I split it to separate sentences then apply the model on each sentence.

What's the best practice?