The purpose of this blog is to simulate the mindset of a data scientist as they approach a data-driven problem. I will do this by attempting to introduce who I am with a psycho-philosophical twist.
Hopefully, by the end of this blog, I will come to some sort of meaningful conclusion about what I know about myself, and why I chose the glorious field of data science.
Who am I?
Well, that’s easy: I am a combination of a consciousness and a collective unconscious. (start broad before you narrow down)
As sufficient as I think that answer is, it is about as satisfying as saying “Hi, I am Paul!”, which is not enough. So let’s revisit the question…
Who am I?
I don’t know… I can only really be defined by what I tell you right now, until you set the variable Paul equal to something.
So to me the variable Paul is equal to:
I am much of what I’m not, and less of what I am. External, untracked factors make up more of me and my decisions than my perception of myself.
So the next step is to define what I am.
To define what I am, I will use the famous quote from the philosopher René Descartes: “cogito, ergo sum”, or “I think, therefore I am.” So, does that make my conscious thoughts me? Even if they are only a small portion of my entirety?
Well, since I can’t be without thinking, and I have to think to act (for the most part), I guess we can also define part of my actions as what I am.
So, going by these boundaries, I am a nomadic tax accountant, born in England and raised in America. Is that a bit better?
A data scientist is always asking questions skeptically and building on those questions until they can turn something vague or unknown into something a little more known.
Why did I choose data science?
The short answer is: I hate accounting, I hate taxes, and I consider data science more of an art than a science.
The long answer is: I subscribe heavily to the Taoist philosophy of yin and yang: order and chaos. I see parallels of this concept in both data science and tax accounting.
What is order? (define parameters)
Order is bordered; nothing can come in, and nothing can go out. It’s limited, it’s known, it’s the accepted. It fears creation because creation is not orderly. Even if you tried to define creation within order as slight rearrangements of that order, you couldn’t, because order doesn’t allow itself to be rearranged.
Metaphorically, it’s a closed, guarded tap hovering over a sink that already contains water.
Order tyrannises chaos.
The financial industry is very orderly without much room to deviate or create. I didn’t like this.
What is chaos?
Chaos is creation, it’s new, it’s life, it’s death, it’s uncertainty, it’s borderless, and it’s formless. Metaphorically speaking, it’s the unlimited flow of water from an unguarded open tap. Everything that’s not who I am, or what I think I know, is chaos.
Chaos oppresses order.
Chaos is unknown, unstructured data. That’s why I said I am much of what I am not, and less of what I am. Beyond the boundaries of my consciousness, the continuation of myself that is expressed beyond myself is hard to keep up with. I am data… whether I track myself or not.
I was a tax accountant. I played by the rules given to me by the IRS every year and never went beyond those borders. There is not much creation in accounting (unless you worked for Enron back in ’98).
Data science is taking elements of chaos and funnelling them through a pipeline (the parameters of which we define) to make sense of them. It’s essentially having one foot in order and one foot in chaos. The main tool it uses to funnel this chaos is the scientific method. The scientific method acts as an efficient open/closed-loop workflow: observe, hypothesise, collect, analyse, accept or reject.
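To make that loop a little more concrete, here is a minimal sketch in Python of the last two steps, analyse and then accept or reject. It is only one possible way to do it: I am assuming a simple permutation test on the difference between two group means, and the function names, the sample numbers, and the 0.05 threshold are all my own illustrative choices rather than anything prescribed by the scientific method itself.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Hypothetical helper: it repeatedly shuffles the pooled observations
    and counts how often a random split looks at least as extreme as
    the split we actually observed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


def decide(p_value, alpha=0.05):
    """Turn a p-value into a decision about the null hypothesis."""
    return "reject" if p_value < alpha else "fail to reject"


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    before = [12, 15, 11, 14, 13]
    after = [18, 21, 17, 20, 19]
    p = permutation_test(before, after)
    print(p, decide(p))
```

Strictly speaking, a test like this never "accepts" a hypothesis; it either rejects the null hypothesis or fails to reject it, which is why the sketch words the decision that way. Either outcome feeds back into the next round of observing and hypothesising, which is the "loop" part of the workflow.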
Chaos is the data, everything mapped about us: from our decisions, to our interactions, to what we will eat for breakfast tomorrow.
Metaphorically speaking, a data scientist acts as a sink that collects the water from that unguarded tap of chaos.
Orderly chaos is data science.
I consider data science more of an art because, in order to create something new or derive meaning from the unknown, we need to take the unstructured or unknown and structure it in a way that results in a hopefully meaningful conclusion.
The famed psychoanalyst Carl Jung put it best:
“The creation of something new is not accomplished by the intellect but by the play instinct acting from inner necessity. The creative mind plays with the objects it loves.” — Carl Jung
I enjoy playing with data and trying to make sense of its chaotic nature.
What do I hope to get out of data science?
I hope to use data science to study human behaviour more deeply and to find ways to model it, so that I can help society solve socioeconomic problems.
I am interested in organisations like DataKind and their approach to data science for humanitarian good.
As far as industries go, I am still in search of one that will pique my interest and stimulate my inner playful child.