Natural Language Processing

I recently uploaded some MATLAB machine learning code to GitHub. Today I'm adding some Natural Language Processing (NLP) code to it. The main concept we covered in class was n-gram modelling, which is a Markov process: the next state or value depends only on a fixed number of preceding values. In NLP this idea is applied by training n-gram probability models on given texts. For example, if we set N equal to 3, then each word in a sentence is assumed to depend only on the previous two words.
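To make the N = 3 case concrete, here is a minimal Python sketch (not the MATLAB code linked from this post) that estimates trigram probabilities by simple relative frequency. The function name `train_ngram`, the padding tokens `<s>` and `</s>`, and the toy corpus are all my own assumptions for illustration, and no smoothing is applied.

```python
from collections import Counter


def train_ngram(sentences, n=3):
    """Estimate P(word | previous n-1 words) by relative frequency.

    Hypothetical helper for illustration: whitespace tokenisation,
    no smoothing, and "<s>" / "</s>" as assumed padding tokens.
    """
    ngram_counts = Counter()
    context_counts = Counter()
    for sentence in sentences:
        # Pad the start so the first real word also has a full context.
        tokens = ["<s>"] * (n - 1) + sentence.lower().split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            ngram_counts[context + (tokens[i],)] += 1
            context_counts[context] += 1
    # Relative frequency: count(context, word) / count(context).
    return {
        ngram: count / context_counts[ngram[:-1]]
        for ngram, count in ngram_counts.items()
    }


# Toy corpus, made up for this example.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
]
model = train_ngram(corpus, n=3)

# "on the" appears three times and is followed by "mat" twice, so this is 2/3.
print(model[("on", "the", "mat")])
```

Each key in `model` is a tuple of three words, and its value is the estimated probability of the last word given the two before it.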

The equation for conditional probability is given by:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

Extending this to multiple sequential events gives the chain rule:

$$P(A_1 \cap A_2 \cap \dots \cap A_n) = \prod_{i=1}^{n} P(A_i \mid A_1 \cap \dots \cap A_{i-1})$$
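Applied to a sentence, the chain rule says the probability of the whole sentence is the product of each word's probability given the words before it. A trigram model shortens that history to the previous two words. The sketch below shows one way this could look in Python. The helper names `trigram_counts` and `sentence_log_prob`, the padding tokens and the toy corpus are assumptions of mine, and unseen trigrams simply get probability zero because there is no smoothing.

```python
import math
from collections import Counter


def trigram_counts(sentences):
    """Count trigrams and their two-word contexts (hypothetical helper)."""
    tri, ctx = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>", "<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(2, len(tokens)):
            tri[tuple(tokens[i - 2:i + 1])] += 1
            ctx[tuple(tokens[i - 2:i])] += 1
    return tri, ctx


def sentence_log_prob(sentence, tri, ctx):
    """Sum log P(w_i | w_{i-2}, w_{i-1}) over the sentence.

    Working in log space avoids underflow on long sentences.
    Returns -inf if any trigram was never seen (no smoothing).
    """
    tokens = ["<s>", "<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for i in range(2, len(tokens)):
        count = tri[tuple(tokens[i - 2:i + 1])]
        if count == 0:
            return float("-inf")
        total += math.log(count / ctx[tuple(tokens[i - 2:i])])
    return total


# Toy corpus, made up for this example.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
]
tri, ctx = trigram_counts(corpus)

# Seen sentence: the only factors below 1 are P(cat | <s>, the) = 2/3
# and P(mat | on, the) = 2/3, so the product is about 4/9.
print(math.exp(sentence_log_prob("the cat sat on the mat", tri, ctx)))

# Unseen sentence built from seen trigrams: about 1/3 * 1/3 = 1/9.
print(math.exp(sentence_log_prob("the dog sat on the sofa", tri, ctx)))
```

Note that the second sentence never appears in the corpus but still gets a non-zero probability, because every trigram in it was seen somewhere. A real model would add smoothing so that unseen trigrams do not zero out the whole sentence.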


The chain rule is very useful for modelling sequential data such as sentences. In finance, these ideas are used heavily in hidden Markov models, which attempt to model the hidden states of various markets. I hope interested readers will comment below with other interesting applications.

The last topic we are covering in class is computer vision. So far we have covered image noise reduction via Gaussian filtering, edge detection, and segmentation. I will post more about them in the future.

Code Link



