Routines and New Experiences

273 words • reading time 1 minute

I’ve always found that life changes, even temporary ones, aren’t easy to adjust to. For someone like me who’s comfortable following routines, having my entire schedule thrown off by other things I have to worry about isn’t easy to deal with. Routines, as I mentioned in my post titled Memories and Time, actually make time appear to go faster as similar events become compressed in memory. So why do I still follow routines? Well, I’ve always felt strongly that you have to do something consistently if you want to learn it properly, and the best way to build consistency is by following routines. Fundamentally, the key is to find a balance between following routines and having new experiences. Following an entire daily routine isn’t going to help nurture your creativity, yet not doing anything consistently won’t lead to effective learning.

It seems that people fall into one of two camps – routine freaks and experience seekers. Some may argue that it doesn’t matter which camp you fall into – shouldn’t we accept people for who they are? That argument is valid, but what we need most in today’s world are problem solvers. Ever-increasing political tension, systemic poverty and overpopulation are just some of the problems that we are facing today. Solving these problems requires both an understanding of specific subject matter and creativity. In short, we need smart, creative people. The best way to become such a person is to follow some sort of routine but also allow yourself time for creative tasks. Creativity can be anything from drawing to going outside to take photos, or just thinking about ideas that solve real problems.

Supervised Learning: Logistic Regression

527 words • reading time 3 minutes

Logistic Regression

Logistic regression solves the classification problem. The key difference between logistic regression and linear or nonlinear regression is that the values y are discrete. For example, say I wanted to predict whether it would rain on a certain day given the barometric pressure recorded every morning. Here, the value y can only take on 0 (no rain) or 1 (rain).

Hypothesis Representation

The hypothesis function for logistic regression differs from its counterpart in linear regression. Instead of predicting the value y directly, it predicts the estimated probability that y = 1 given the input x. The hypothesis function is as follows:

Hypothesis Function
$h_\theta(z) = \frac{1}{1 + e^{-z}}$, where $z = \theta^T x$

The hypothesis function is a sigmoid function (see image below), where z is the same expression that serves as the hypothesis function in linear regression.
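To make the hypothesis concrete, here’s a minimal Python sketch of the sigmoid and of the probability it produces. The function names `sigmoid` and `hypothesis` are my own, and the two-branch form is just one way to avoid overflow for large negative z; treat it as an illustration rather than the course’s implementation.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)

def hypothesis(theta, x):
    """Estimated probability that y = 1, where z = theta^T x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

Here `x` is assumed to include a leading 1 so that `theta[0]` acts as the intercept.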

A decision boundary can be created since we predict y = 1 if $h_\theta(z) \geq 0.5$ and y = 0 if $h_\theta(z) < 0.5$. Because the sigmoid equals exactly 0.5 at z = 0, as the graph above shows, we can conclude that:

Decision Boundary
$y = 1$ when $\theta^T x \geq 0$, and $y = 0$ when $\theta^T x < 0$
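The decision boundary means you never have to evaluate the sigmoid just to classify a point; checking the sign of θᵀx is enough. A small sketch, where `predict` is a hypothetical helper name and the parameter values in the usage note are made up for the rain example:

```python
def predict(theta, x):
    """Return 1 when theta^T x >= 0 (i.e. h >= 0.5), otherwise 0."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0
```

For instance, with invented parameters `theta = [1005.0, -1.0]` and inputs of the form `[1.0, pressure]`, the boundary sits at a pressure of 1005: anything at or below it is classified as rain.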

Cost Function

Because the hypothesis is no longer linear, the squared-error cost from linear regression is replaced by the following logistic cost function:

Cost Function
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$

An easy way to convince yourself that this equation makes sense is to plug in hypothetical values of y and $h_\theta(x)$ and observe the penalty when the two agree or disagree. For example, if y is 1 and $h_\theta(x) = 0.1$, then the penalty is $-\log(0.1) \approx 2.3$. If y is 1 and $h_\theta(x) = 1$, then the penalty is 0.
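If you’d rather play with those hypothetical values in code, here’s a short sketch. The names `penalty` and `logistic_cost` are mine, and I’m assuming the natural logarithm. Splitting on y avoids evaluating log(0) for the term that is multiplied by zero anyway.

```python
import math

def penalty(y, h):
    """Per-example cost: -log(h) when y = 1, -log(1 - h) when y = 0."""
    if y == 1:
        return -math.log(h)
    return -math.log(1.0 - h)

def logistic_cost(ys, hs):
    """Average penalty over the data set, i.e. J(theta) for given predictions."""
    return sum(penalty(y, h) for y, h in zip(ys, hs)) / len(ys)
```

A confident wrong prediction (y = 1, h = 0.1) costs about 2.3, while a confident right one (y = 1, h = 1) costs nothing.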

Gradient Descent

The algorithm for gradient descent doesn’t change compared to linear regression, but the computation will slightly differ due to the different hypothesis function.

Gradient Descent
repeat until convergence {
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
}
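The update rule above can be sketched in plain Python as follows. This is only an illustration: `gradient_step` and `fit` are hypothetical names, each row of `X` is assumed to start with a 1 for the intercept, and I’ve swapped “repeat until convergence” for a fixed number of iterations to keep things short.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def gradient_step(theta, X, y, alpha):
    """One batch update: theta_j -= alpha * (1/m) * sum((h - y) * x_j)."""
    m = len(X)
    errors = [sigmoid(sum(t * xi for t, xi in zip(theta, row))) - yi
              for row, yi in zip(X, y)]
    # Every new theta_j is built from the old theta, so the update is simultaneous.
    return [theta[j] - alpha * sum(e * row[j] for e, row in zip(errors, X)) / m
            for j in range(len(theta))]

def fit(X, y, alpha=0.1, iterations=1000):
    """Run a fixed number of gradient steps starting from all-zero parameters."""
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        theta = gradient_step(theta, X, y, alpha)
    return theta
```

Notice that the only difference from the linear regression version is the call to `sigmoid` when computing the errors.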


As with linear regression, regularization can be performed to prevent over-fitting. The regularized cost function and gradient descent algorithm are as follows:

Regularized Cost Function
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$

Regularized Gradient Descent
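repeat until convergence {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}$
  $\theta_j := \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j \right]$ for $j = 1, \dots, n$
}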
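Here’s a hedged sketch of a single regularized update in Python. The name `regularized_step` is my own, `lam` stands in for λ, and each row of `X` is assumed to begin with a 1 so that `theta[0]` is the unpenalized intercept.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def regularized_step(theta, X, y, alpha, lam):
    """One regularized batch update; theta[0] is left unpenalized."""
    m = len(X)
    errors = [sigmoid(sum(t * xi for t, xi in zip(theta, row))) - yi
              for row, yi in zip(X, y)]
    new_theta = []
    for j in range(len(theta)):
        grad = sum(e * row[j] for e, row in zip(errors, X)) / m
        if j > 0:
            # Penalty term (lambda / m) * theta_j applies to every parameter but theta_0.
            grad += (lam / m) * theta[j]
        new_theta.append(theta[j] - alpha * grad)
    return new_theta
```

Setting `lam` to 0 recovers the unregularized update.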

Again, these equations are identical to the ones for linear regression, except that the function $h_\theta(x)$ is now the sigmoid function.

Supervised Learning: Regression

853 words • reading time 5 minutes

I thought I’d go over an important supervised learning technique, linear regression. For those not familiar with machine learning, ‘supervised’ implies that historical data is given to build a prediction model. I’ll follow up this post with another one on logistic regression, which is a prediction model for discrete data as opposed to continuous data. Let’s get started!


In layman’s terms, regression is fitting a function to a set of data. For example, if I had data on the price of a specific model of car and its mileage, I could fit a line that could ‘predict’ the price given a car’s mileage. Evidently, there are many other factors that influence the price of a car, and these should be included in the model to improve predictions. In the two-dimensional case the fit is a line, and in the three-dimensional case the fit is a plane. Visualizing beyond three dimensions is nearly impossible. In machine learning, the function of the fit is called the hypothesis function and is denoted $h_\theta(x)$. For example, if we were fitting a simple straight line to a set of data, the hypothesis function would be the following:

Hypothesis Function

$h_\theta(x) = \theta_0 + \theta_1 x$

To find the parameters $\theta_0$ and $\theta_1$, there are two main strategies you can use: gradient descent or the normal equation.

Gradient Descent

Gradient descent involves the successive approximation of parameters by subtracting the partial derivative of the cost function multiplied by some learning rate from the previous estimate of each parameter.

The cost function is the sum of the squared residuals multiplied by 1/(2m), where m is the number of points in the data set. Mathematically, the cost function J is defined as:

Cost Function

$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
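As a quick sanity check, the cost for a straight-line fit can be computed in a few lines of Python. This is a minimal sketch; `hypothesis` and `cost` are names I picked, and `theta` is assumed to be the pair (θ0, θ1).

```python
def hypothesis(theta, x):
    """Straight-line fit h(x) = theta_0 + theta_1 * x."""
    return theta[0] + theta[1] * x

def cost(theta, xs, ys):
    """J = (1 / 2m) * sum of squared residuals."""
    m = len(xs)
    return sum((hypothesis(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```

A line that passes through every point gives a cost of exactly 0, and the cost grows as the line moves away from the data.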

And the gradient descent algorithm looks like this:

Gradient Descent

repeat until convergence {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$
}

Taking the partial derivative of the cost function, the algorithm becomes the following:

Gradient Descent

repeat until convergence {
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
}

There are two things to take note of. The first is that the parameters are updated simultaneously, and the second is that the learning rate has to be chosen carefully so that it isn’t too small or too big. If it’s too small, the algorithm will converge very slowly, and if it’s too large there’s a chance it won’t converge.
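Both points show up in a small Python sketch for the straight-line case. The function name, the default learning rate, and the fixed iteration count are all my own choices for illustration, not values from the course.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # x_0 is always 1
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old values.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1
```

On data generated from y = 1 + 2x with x from 0 to 4, the default settings recover the intercept and slope closely, whereas a learning rate of 0.5 makes the estimates blow up on that same data.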


One of the problems that can occur when there are many features/variables is over-fitting. Over-fitting is to be avoided because your model loses its predictive ability, which is the whole purpose of developing a model in the first place. One method to prevent over-fitting is to apply regularization. The idea is to keep the values of the parameters small by adding the sum of squares of the parameters (theta) to the cost function. The cost function changes to the following:

Regularized Cost Function
$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]$

Note the added term $\lambda \sum_{j=1}^{n} \theta_j^2$, where lambda is a parameter set by the user and n is the number of features. Also, notice that the first parameter, $\theta_0$, isn’t penalized. Since the cost function changes, the gradient descent algorithm also has to be updated:

Regularized Gradient Descent
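repeat until convergence {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}$
  $\theta_j := \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j \right]$ for $j = 1, \dots, n$
}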
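One way to see what the penalty does (this is my own rearrangement of the second update, not something taken from the course) is to collect the $\theta_j$ terms:

```latex
\theta_j := \theta_j \left( 1 - \alpha \frac{\lambda}{m} \right)
          - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```

As long as $\alpha \lambda < m$, the factor $1 - \alpha \lambda / m$ is a little less than 1, so each iteration first shrinks $\theta_j$ slightly and then applies the usual unregularized gradient step.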

Normal Equation

The normal equation is more familiar to those who’ve taken an undergraduate statistics course. The method used to derive the normal equation is called least squares estimation. At a high level, the idea is to minimize the sum of the squared residuals ($h_\theta(x) - y$, where y is the actual value of each point and h is the fitted value) by taking the derivative of the cost function J and setting it to 0. Solving for theta results in the following matrix expression:

Normal Equation

$\theta = (X^T X)^{-1} X^T y$

The advantage of using the normal equation is that you don’t need to choose alpha and don’t need to iteratively solve for the parameters. However, computing the $(X^T X)^{-1}$ term has $O(n^3)$ complexity (n here represents the number of different features or variables x), which means it’s better to use gradient descent when you have a lot of features.
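For small problems the normal equation is easy to try out without any libraries. The sketch below is an assumption-laden illustration: `normal_equation` is my own name, each row of `X` is assumed to start with a 1, and rather than forming the inverse explicitly it solves the equivalent system (XᵀX)θ = Xᵀy with Gaussian elimination, which is also cubic in the number of features.

```python
def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y by Gaussian elimination with partial pivoting."""
    n = len(X[0])
    # Build A = X^T X (n x n) and b = X^T y (length n).
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    # Forward elimination.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; some features may be redundant")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    theta = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * theta[j] for j in range(i + 1, n))
        theta[i] = (b[i] - s) / A[i][i]
    return theta
```

On points that lie exactly on y = 1 + 2x, this returns parameters very close to [1, 2] in a single pass, with no learning rate to tune.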

Time For The Stars

361 words • reading time 2 minutes

I recently finished an interesting science-fiction novel called Time For The Stars, written by the great Robert Heinlein. This was the first novel I’ve read by Heinlein, and it was far from a disappointment (I plan on diving into his other novels this summer). The plot centered around identical twins, Tom and Pat, who were able to communicate telepathically with one another. Neither of them knew they were telepathic until they were tested by the Long Range Foundation (LRF), an organization that funded expensive long-term projects that would benefit the human race. The LRF at the time was conducting research on telepathy, which was only possible among a small minority of twins and close relatives. It was part of a larger project to send a dozen ships across space to find more habitable planets. Since the novel took place well into the future (no date was specified but several other planets were already inhabited by humans), population growth had continued to put significant stress on the Earth’s resources and productive capacity.

It was planned that each ship would carry, in addition to researchers and engineers, one telepath from each pair – one would stay on Earth and the other would travel on one of the ships so that communication could be maintained between Earth and the ships. There was one catch though – due to relativity, time would go slower for those on the ships compared to those on Earth (try to think back to Einstein’s time dilation equation for those who took high school physics). Most of the novel focused on Tom, the twin who went on the expedition, and his crew on the Elsie. I won’t spoil the rest of the novel, but the most interesting part to me was how Tom’s relationship with his brother evolved as weeks/years passed by on the ship. Imagine if in one week everyone close to you suddenly aged 20-30 years but you remained the same. Now, imagine if in one year the world suddenly fast-forwarded a century. Crazy, right? I wholeheartedly recommend this novel to any fan of science fiction or anyone looking for a nice short read.

My Goals This Summer

624 words • reading time 3 minutes

With the completion of yet another year at university, an exciting and busy summer awaits. One of the few perks of going to university is the four-month summer (for those in Canada), which means there’s plenty of time to have fun and learn something new. Even if you’re interning during the summer, there’s still plenty of time for exploration and learning. However, without a plan for what you want to accomplish, you’ll end up looking back wondering where all that time went. Prior to last year, most of my summers were unplanned, which in retrospect prevented me from learning skills that I would’ve liked to have today. Having several goals and a clear plan for how you will achieve them is one of the most important steps towards accomplishing them. Obviously, execution is still key, but once you get into a routine it gets easier.

I thought I’d share my plan this summer for the skills that I’d like to develop and how I plan on achieving them. Hopefully, by writing things down I’ll have something holding me accountable if I don’t put in the effort to achieve any of them. Since I’ll be interning with Mozilla this summer in San Francisco, I’ve also tried not to set the bar too high but just at what I think is the right level.

Get Better at Writing

I never particularly enjoyed English class in high school (analyzing Shakespeare at such a deep level killed my interest in literature and writing), but having the freedom to share my thoughts actually became sort of fun. I started blogging last summer with a focus on analyzing companies by going over their financial statements, but I stopped after the summer because I lost interest. I then started this blog in February with no specific focus, more as a ‘professional’ diary. I know my writing still has a long way to go, but my plan is to post at least once a week on Sunday about something related to global development, general philosophy, sometimes math (sorry to those who have an aversion to math), and other miscellaneous topics. Hopefully, over time, I’ll develop some sort of niche in terms of the topics I write about.

Learn to Draw

Just like writing, as a child or during high school I never found much interest in drawing, but I feel like picking up this skill will improve my creativity and give me another tool when trying to communicate ideas. From what I’ve heard, consistency is really key if you want to improve, so my plan is to spend around 2 hours 3-4 times a week drawing and continuously getting some type of feedback from communities on Reddit.

Take as Many Photos as I Can

It’s been over a year since I purchased my used Canon Rebel Xsi, but I haven’t been able to consistently take photos since then. I know the basics of photography (ISO, aperture, shutter speed), but the next step is actually being able to take really nice photos. My plan is to spend 3-4 days a week taking photos in the city. I’m also considering joining meetups since it’d give me the opportunity to learn from more experienced photographers.

Learn Machine Learning and ‘Data Science’

I signed up for the Machine Learning course on Coursera and have already started listening to the lectures. The Data Science course also starts today and goes until early July. I’m pretty excited for these two courses since I’m looking to build my data analysis skills, which I can apply to personal projects. Both these courses end in July, so for the rest of the summer I’ll be working on some type of project.

Overpopulation and Our Future

654 words • reading time 3 minutes

There are currently 7 billion people living on the earth. In ten years, projections have the population inflating to 8 billion [1]. More people means more waste, less space, and a list of other challenges, including providing ubiquitous access to health services and effective urban planning. But the real question is whether our earth can actually support this many people. As we’re already pushing it to its limits in terms of essential resources (food, water and oil), can we (those in North America) sustain our current way of living?

What prompted me to write this post was an hour-long BBC Horizon documentary that I watched titled How Many People Can Live on Planet Earth (if you enjoy listening to David Attenborough’s voice, that’s a good enough reason to check it out). Attenborough first laid out the facts: the population only started to grow exponentially about 500 years ago (see image below). This was attributed to advances in medicine, which meant people no longer died from diseases that are easily preventable by today’s standards.

Thomas Malthus was one of the first scholars to suggest that uncontrolled population growth is not sustainable given the fixed amount of resources that the Earth can provide. In his famous paper titled “An Essay on the Principle of Population”, Malthus declared that “The power of population is indefinitely greater than the power in the earth to produce subsistence for man” [2]. Humans naturally have a drive to reproduce, and so Malthus described positive checks that would increase death rates (disease and war) and preventive checks that reduce birth rates (abortion, prostitution, getting married at an older age). These checks would by definition stabilize the population and prevent it from increasing exponentially.

Although Malthus’ beliefs were and still are debated, there’s some evidence supporting his claims. Attenborough mentioned that freshwater levels around the world have been on the decline over the past one hundred years. Water, he says, has the potential to become what oil is today: scarce and valuable. In Mexico City, for instance, water scarcity has become such a huge issue that water trucks have to drive around the city to supply water to families and businesses every day. Water isn’t just used for drinking and cleaning; in fact, 70% of freshwater is used for irrigation [3]. The relationship between food and water makes the population issue an even greater concern.

The challenge with overpopulation is coming up with an effective solution. Countries like China have implemented one-child policies to control their population (see image below). Although effective at keeping the population in check, these policies have introduced more problems, such as an increase in the number of female orphans. In the 1970s, the Indian government introduced incentives for vasectomies, but it then started to punish criminals by sterilization until the public fought back. Clearly, political and economic solutions aren’t perfect. A better solution, Attenborough argues, is to educate the population, since educated parents typically have smaller families. Education, along with access to contraceptives, has the potential to keep the population at moderate levels.

Some people may argue that we each have a moral responsibility to conserve earth’s resources. Others believe that we have the ability to innovate our way out of problems arising from overpopulation and that we can continue to consume at our current rate. The truth is that we do have to take care of the earth, but we also have the capacity to figure out a way to accommodate more people. The important thing is that we start thinking about the problem and how to grow sustainably so our environment doesn’t become completely ravaged. If we neglect this problem and continue on with our current way of living, it can have profound consequences for us and the earth.

1. UN World Population Report
2. An Essay on the Principle of Population
3. UN Water Statistics

Week Nine - Final Thoughts

290 words • reading time 1 minute

I’m currently taking an online course on AIDS at Coursera. Every week I’ll be summarizing what I learned. If you have time, I recommend that you take the course as well.

This is it, the final week. This post won’t be as long as previous ones, and I’ll talk a little more about my thoughts on the course. Before I continue, I just wanted to mention that Coursera is a great educational platform and if you’re a student, I strongly recommend taking a few courses to supplement your education (for me, there were certain elective courses that I couldn’t take because of scheduling conflicts).

After 9 weeks, I felt that the course covered a wide range of material from the biology of the disease to different preventive and treatment measures being applied. Taking this course made me realize how difficult it is to solve the HIV/AIDS epidemic. Tackling the problem needs to account for all solutions, from biological to behavioral and political. It’s not enough to simply find a cure or vaccine (although that would be a great step towards eradicating the disease), since you also need to distribute the drugs effectively.

Moving forward, I think more people need to be educated on the topic of HIV/AIDS. Given how much of an impact it’s made on the world, it’s important that everyone knows not only what it is from a biological standpoint but also what’s being done to reduce the number of deaths. The more we know, the easier it is to contribute based on our skillset. Like many other big problems, it won’t be solved by a single individual or a handful of individuals, but through the collaboration of people with different skills, each contributing a piece to the larger solution.

Memories and Time

637 words • reading time 3 minutes

There was this wonderful BBC series about time that I watched over the past week, presented by theoretical physicist Michio Kaku. The series was divided into three parts – lifetime, earth time, and cosmic time. I’m only going to delve into the first video, since I found this one to be the most interesting (not that I didn’t like the other ones).

A simple but interesting experiment Kaku performed was to time random people of all ages counting to 60. Older people on average took more than a minute to count to 60, while younger people did so in slightly less. What this means is that older people have slower internal clocks than younger people, or more simply, time seems to go by faster as you age. The biological effects of aging get compounded by our routine-based lives, making life appear to go by even faster. The past 7 years of my life have seemed to just pass by, and I’ve just turned legal (in the US). I think a lot of this can be attributed to the fact that each day has been relatively the same – wake up, go to school, study, and sleep. Other than weekends, when my activities deviated slightly from weekdays, and the odd trip/vacation that I took, that 7-year span seems to have compressed into half that time. Kaku suggested that making slight changes to your actions throughout the day can help slow the apparent passage of time. For instance, visiting a new park and socializing with strangers, or maybe trying to learn something new. The point he was trying to make was that you need to have a steady flow of new experiences in order to slow down time.

Another interesting point that Kaku mentioned was that humans are the only species aware of their own mortality. I believe this is still somewhat debated among psychologists, since elephants [1] and dolphins [2] have been found to mourn the dead. Still, there hasn’t been evidence that any other animal knows that death is inevitable (they simply know that death can occur), which puts us in a unique position. Our eventual demise maybe isn’t something we would’ve wanted to know about, but knowing instills a sense of urgency to value our short time in this world. We may get caught up in the day-to-day of our busy lives, but the awareness that time is running out forces us to re-evaluate the decisions we make.

The final part of the video dealt with the possibility of living forever or for a very long time. There already exist organisms like yeast that can live forever if given something to feed on. It was also found that manipulating certain genes in nematodes can prolong their lifespan. The real question is whether what lets yeast live forever and nematodes live longer can be applied to humans. There already exist life-prolonging chemicals like antioxidants, found in certain fruits and vegetables, that can mitigate the effects of free radicals, which damage genetic material. Aubrey de Grey, one of the prominent figures leading the fight against aging, thinks that in around 50 years we could live until we’re 1000. For me, living that long really depends on what that extra life would be spent doing. I don’t think anyone would want to stay in a retirement home for 900 years, but if somehow my health deteriorated at a much slower rate and I had something meaningful to do, then why not live longer? I think that scientists should always be thinking of the social repercussions of what they’re trying to achieve. Living for a long time isn’t inherently a good thing if society cannot adequately support the needs of the people or if these people fail to find happiness during these extra years.

1. Link 1
2. Link 2

Week Eight - HIV Challenges and a Cure

669 words • reading time 3 minutes

I’m currently taking an online course on AIDS at Coursera. Every week I’ll be summarizing what I learned. If you have time, I recommend that you take the course as well.

Boy, how time flies! I just finished the lecture for the second-to-last week, and finals at university are also starting this week. The topics covered this week were HIV challenges and a cure. The section on challenges was a general summary of all the topics touched on throughout the course, while the section on a cure focused on current strategies and those that are being developed.

If you’ve been taking the course or following my weekly posts, it’s probably obvious how difficult it is to prevent HIV and to treat those infected with it. Not only does it take a long time before symptoms manifest, but the social complexity adds a completely new level of complications for those trying to provide help. The main challenges listed by Hagen’s colleagues were:

  • Political willpower
  • Molecular factors affecting transmission
  • Overcoming stigma and prejudice
  • Political acceptance of CVCT (couples’ voluntary counseling and testing) as an effective prevention
  • Developing effective biomedical interventions for women
  • Reducing cost of TaSP, PrEP, and increasing use of microbicides
  • Overcoming fatigue, denial, and epidemiology
  • Recruiting and maintaining HIV+ adolescents in care
  • Mitigating the treatment cascade
  • Reducing treatment stigma, increasing health literacy
  • Increasing diversity among clinical trial volunteers
  • Overcoming scientific obstacles
  • Increasing support beyond public health organizations

In addition, Hagen also listed what she thought were challenges moving forward:

  • Awareness – Not enough people are aware that they’re at risk or even have HIV
  • Education
  • Diagnosis
  • Prevention – Increasing access to interventions
  • Vaccines – Is one even feasible?
  • Access to care – Can we not only increase access, but also do so in a cost effective manner?
  • Drugs
  • Research – Vaccine and antiretroviral research is still relatively new
  • Eradication – More reliable tools for diagnosis and intervention

One of the most important questions to ask is why a cure is needed, given that there are already behavioral and biological prevention options that are becoming more accessible.
The problem with current prevention and treatment options is that they aren’t perfect – HIV is a lifelong disease and even highly effective treatment procedures like ART require lifelong adherence (the time to eradicate HIV on HAART is around 73.4 years).

There are two types of cures – a functional cure where HIV is still found in reservoirs (dormant) but not actively replicating and a sterilizing cure where HIV is completely eliminated from the body. In both cases, HIV will not cause AIDS. There is some individual variation in how the body responds to infection, and certain people have the ability to suppress or control viral loads in their body for some duration of time. Long term non-progressors (LTNP) such as super elite controllers can live with undetectable viral loads for long periods of time (sometimes for their entire life). Viremic long term non-progressors have viral loads at detectable levels, but the disease progresses slowly. Viral load controllers can live with HIV without developing AIDS for some time (not as long as LTNP) before viral loads start to increase.

Curing HIV is difficult for two main reasons – reservoir persistence where infected cells remain dormant and ongoing immune activation which maintains CCR5 expression (making cells more prone to being attacked by the virus). Current strategies for a cure are focused on a few areas:

  • Treatment intensification – This was found to have no effect
  • Early treatment with ART – Reduces viral loads but not at the stage where it can be considered a cure
  • Eliminate latently infected cells – HIV reactivated and homeostatic proliferation occurred (T-cells reproduced)
  • Make cells resistant to HIV – So far has been found to have no effect also

Newer strategies are focused on killing infected cells, combination treatment, and a multipurpose therapy.

Week Seven - Vaccines

496 words • reading time 2 minutes

I’m currently taking an online course on AIDS at Coursera. Every week I’ll be summarizing what I learned. If you have time, I recommend that you take the course as well.

This week’s set of lectures centered around vaccines – the history of vaccines, what they are, and the process for developing a vaccine today. Some of what was covered brought me back a few years to high school biology when we learned about techniques for injecting genetic material into another organism. Let me briefly go over some of the key points from the lectures.

Some people might think that behavioral interventions along with treatment as prevention (TaSP) are good enough solutions that a vaccine is not needed. This isn’t true because behavioral interventions are difficult to implement on a wide scale, especially in rural populations where people lack proper education, and treatment as prevention requires the patient to adhere to the medication or treatment for potentially years. A vaccine is a more sustainable solution towards preventing people from being infected by HIV.

So what exactly is a vaccine? A vaccine does one of two things – it teaches your immune system to protect itself from a specific disease prior to infection, or teaches your immune system to combat an infection that is already present. More simply, Dr. Hagen (the teacher of the course) suggested that “vaccination is like training an army or providing your personal security detail with a photo of the bad guys” [1]. There are six main types of vaccines [1]:

  1. A weakened germ
  2. A killed germ
  3. A subunit of the germ’s protein structure
  4. DNA injected into the host cells (plasmids produce antigens that the immune system learns to recognize and mount a defense against)
  5. Recombinant vectors (a vector is a weakened strain of a bug that is good at putting the immune system on high alert)
  6. Prime/Boost (A two step technique – DNA ‘prime’ followed by a recombinant vector ‘boost’)

The word vaccine derives from the Latin word ‘vacca’, which means cow. Dr. Edward Jenner developed the first vaccine to prevent smallpox when he noticed that those initially infected with cowpox, a milder form of smallpox, became immune to smallpox [2]. Today, the development of a vaccine undergoes a much stricter process compared to the past. The typical phases before providing the vaccine to the public go as follows [1]:

  • Phase I: 20-100 subjects. The goal is to test for side effects and dosage. Typically lasts 12-18 months.
  • Phase II: Hundreds of subjects. Testing for tolerability and the reaction of the immune system to the vaccine. Lasts over 2 years.
  • Phase III: Thousands of subjects. Testing for whether vaccine works in the real world. Lasts 3-5 years.

A big challenge today is getting people to participate in testing programs. Those involved in testing need to engage with the community and properly market the study in order to attract as many people as possible.

1 Link 1
2 Link 2