satori as a phase transition

I am a big fan of Absolute Idealism, which posits that the mind mirrors reality and that the logic of the world is the same as the logic of the mind. (See Hegel.) The world is comprehensible because it too is a mind, and all minds are complex adaptive systems.

The hardest thing to understand is why we can understand anything at all.
- Albert Einstein

There are levels in understanding for the same reason that there are levels in any complex dynamics. Thoughts constitute a world unto themselves, and transformative learning experiences create cascading effects that eventually reach the core of what holds your belief system together. These irreversible experiences (which arise at the moments when you are transitioning to higher levels) are like big earthquakes. They are rare but easy to recognize. (When such earthquakes take place in the collective mind, we call them paradigm shifts.) In Buddhism, Satori is characterized as one such extreme peak experience.

Satori is the sudden flashing into consciousness of a new truth hitherto undreamed of. It is a sort of mental catastrophe taking place all at once, after much piling up of matters intellectual and demonstrative. The piling has reached a limit of stability and the whole edifice has come tumbling to the ground, when, behold, a new heaven is open to full survey...

When a man's mind is matured for satori it tumbles over one everywhere. An articulate sound, an unintelligible remark, a blooming flower, or a trivial incident such as stumbling, is the condition or occasion that will open his mind to satori. Apparently, an insignificant event produces an effect which in importance is altogether out of proportion. The light touch of an igniting wire, and an explosion follows which will shake the very foundation of the earth. All the causes, all the conditions of satori are in the mind; they are merely waiting for the maturing. When the mind is ready for some reason or other, a bird flies, or a bell rings, and you at once return to your original home; that is, you discover your own real self. From the very beginning nothing has been kept from you, all that you wished to see has been there all the time before you, it was only yourself that closed the eye to the fact. Therefore, there is in Zen nothing to explain, nothing to teach, that will add to your knowledge. Unless it grows out of yourself no knowledge is really yours, it is only a borrowed plumage.

D. T. Suzuki - An Introduction to Zen Buddhism (Page 65)

The contrast between the cultivational mindset of the East and the transactional mindset of the West becomes very stark here. Satori is not a piece of information, and enlightenment is not transferable. This arouses immediate jealousy and subsequent skepticism in most unenlightened Western minds. “What do you do differently now?” they ask, as if it were possible to instantly reverse-engineer a self-organized criticality that took years of strenuous effort to build.

Transfer of wisdom requires preparedness. Transfer of information does not. (This is why education is so resistant to technological improvements.) Generally speaking, the wiser the message, the lower the probability of a successful transmission. You cannot expect a student to make several jumps at once. True learning always happens one level at a time. Otherwise, internalization cannot take place, and what is “learned” starts to look more like a “borrowed plumage”.

The wisest thinkers are read the most but retained the least, because we all like taking shortcuts unless someone actively prevents us from doing so.

A good mentor both widens your horizon and restricts your reach. Today’s obsession with individual freedom prevents parents from seeing the value of restriction in education. They want teachers to only widen horizons, but forget that unbalanced guidance can actually be worse than leaving the students alone and completely self-guided. In a completely free learning environment something magical starts to happen: the right path to wisdom starts to self-assemble. What a good teacher does is catalyze this natural self-assembly process. Wrong guidance, on the other hand, is too accelerative (or artificial) and results in the introduction of subjects (and authors) too early for successful retention. It creates illusions of learning and, even worse, turns students permanently away from certain subjects (and authors) because of misunderstandings or feelings of inadequacy.

Do you remember absolutely falling in love with certain books and then falling out of love with them later on? This is a completely natural process. It actually means you are on the right path and making progress. In a sense, every non-fiction book is meant to be superseded, like a small phase transition. (Good fiction, on the other hand, can stay relevant for a long time.) This, however, does not mean that you should be less thankful to the authors of those books you no longer enjoy. They were the necessary intermediary steps, and without them you would not be where you are today. Of course, the journey looks nonlinear, funny and misguided in retrospect, but that is exactly how all natural journeys look. Just observe how evolution reached its current stage, how completely alien and unintuitive the microcosmos is!

And don’t forget, the future (in the world of both thoughts and things) always remains open and full of surprises. Learning is a never-ending process for us mortals. Enjoy it while it lasts.

angel investors as hybrid unifiers

Investors thrive on social validation and follow trends. Innovators thrive on social invalidation and create trends. How can such completely different mindsets meet in the middle and cooperate for a common goal? Leaps of faith? Heaps of information asymmetry? Well, the real answer lies in the existence of a hybrid class of creatures called the angel investors, the most successful of whom are usually entrepreneur-turned-investor types.

Social Decoupling

Since angels manage their own money, they can more easily decouple themselves from societal expectations, make independent decisions and stay away from consensus-driven homogeneity.

Lastly, consensus frequently kills the outliers. Within venture firms, there are countless examples of firms passing on seed-stage deals because they were crazy ideas and consensus didn’t exist, only then to watch them grow into unicorns. Nearly a decade ago, Atlas changed its approach to seed investing in that any individual partner could back a “seed deal” (size delimited) without consensus support. This empowered partners to take some risk. As those investments mature, we generally move towards a team consensus as capital intensity increases. But at the earliest phases of the decision process, when uncertainties are high and getting to consensus could prevent leaning forward into a “risky” seed, we think this approach works.

Bruce Booth - Biotech Startups And The Hard Truth Of Innovation

Contrarian Spirit

Since angels are usually tech entrepreneurs themselves (or were in the past), they can more easily resonate with the early-stage ambiguities and challenges facing an entrepreneur, and be generally more accepting and tolerant. For the same reason, they also exhibit the most contrarian behavior among investors. In fact, they like the cutting edge so much that they actually stay away from existing trends. They like being right while others are wrong. In other words, they care for a particular kind of social validation. The general public does not have the intellectual sensibility required for the epistemological jealousy they are after, but other angels do. That is why angel investing thrives only in communities.

From the outside, angel investing may look like it’s motivated simply by money. But there’s more to it than that. To insiders, it’s more about your role and reputation within the community than it is about the money… No one can really predict what startups will succeed or fail, no one can really predict what trends are real or illusions, to any genuine degree of confidence. But everyone has to get up every morning and put on a bit of a show: that you are perceptive, you are brilliant, you are contrarian, and you are right.

Alex Danco - The Social Subsidy of Angel Investing

Note that this is not about being contrarian per se. You need to be both contrarian and correct. So angel investing is a stressful affair. It is one of those jobs that most people would not do just for money. That is why angels are given almost the same heroic treatment as entrepreneurs.

Domain Knowledge

Since angels are not restricted by any investment mandates, they can freely explore the edge cases. More importantly, due to their past entrepreneurial experience, they tend to have deep domain expertise that generalist investors by definition lack. Their backgrounds and their networks allow them to significantly de-risk their positions.

Remember, risk is in the eye of the beholder. As a company matures, its risk profile becomes more quantifiable, but at the very early stages it is very qualitative and resistant to the formulaic approaches taught in business schools. The ideal investor profile for each early-stage company is actually different. That is why (again) you need a whole ecosystem of angels who can handle different types of investment theses and entrepreneurial teams. (Personality match is also important.)

I like to picture the dynamics of the tech investment world as follows. There are certain individuals, called entrepreneurs, who are far ahead of the general public. The first tenuous link these people establish with society is through angels, who collectively act like an umbilical cord. As the company matures and passes the test of time, increasingly bigger and increasingly more conservative investors step in to help with the actualization process. Finally, at the very end, the risk profile of the company becomes ready to merge with the general public via an IPO. (Tech companies used to IPO a lot earlier, but this was primarily due to the immaturity of the private capital ecosystem. I believe it is financially more natural and morally more appropriate for these companies to meet the public later in their life cycles.)

the evolution of family businesses

In Turkey, the bulk of capital is in the hands of family businesses. Consequently,

  • where these businesses invest steers the economy at large, and

  • how well they can adapt to changing conditions determines the evolvability of the economy at large, and therefore its future health.

Unfortunately in our country (partly for cultural reasons, and partly because of the harshness that comes with the geography), the leaders at the helm of family businesses fail to raise new leaders and to clear the way for the generation coming up behind them.

As far as I have observed, two important mistakes are being made here.


Mistake 1: You Cannot Teach How to Manage by Teaching How to Be Managed

Many big bosses want their children to learn the organization they own from the bottom up. A few concerns lie behind this wish.

  1. One cannot manage a business one does not know.

  2. One cannot empathize with a person one has not worked shoulder to shoulder with, and therefore cannot manage them effectively.

  3. One who lives in a bubble gets spoiled and drifts away from reality.

However nice these thoughts may be, in reality there is only one way to learn how to manage, and that is to manage. You cannot learn how to manage by starting at the very bottom, drowning in incredible detail, and being managed by others (no matter how good your powers of observation are). As for the concerns above…

  1. One can also learn a business from scratch by managing it. It is not necessary to move to the other side of the table. Businesses differ only in their details anyway; structurally they all resemble one another. (Otherwise professions like investing and professional management could not exist.)

  2. One can establish an empathetic dialogue with the people one manages as well; it is not necessary to descend to their level positionally. Besides, empathy is mostly a matter of character, and learning it needs to start at a much younger age.

  3. Even if the boss's children step out of the bubble in their business lives, they return to it in their social lives. (And since the company itself is a bubble, they never truly leave the bubble anyway.)

So how can you teach management to someone inexperienced? By abruptly putting them at the head of the whole company? No; by having them manage small teams and make small mistakes, of course!

  • Better yet, by having them do these exercises entirely outside the family-business bubble. (That is, by putting your child at the very top of a small, independent team on the outside, not at the head of a small team inside a big one.)

  • Better yet, by asking them to build their own team from scratch. (How else can one learn to pick people?)

  • Better yet, by leaving to them the decision of what to build that team for. (How else can one learn to set one's own goals?)

In short, ask your child to go and found a startup. Running a big company is of course not the same as running a small one, but the problems encountered are generally the same; only the numbers (and therefore the costs of the mistakes made) are bigger. In other words, successfully running a small company will instill the most important managerial skills.

Founding a startup from scratch will also help your child discover themselves and settle into their own management style. It is very important that you grant them room for freedom on the outside and allow them to make mistakes. Inside the family business, on the other hand, it is very hard for your child to learn anything.

  • They get crushed under the pressure created by your strong authority and find no room to breathe.

  • Having all eyes on them creates stage fright, a fear of making mistakes.

Of course, you should not make your child suffer unnecessarily either. The mentality of “When I came to Istanbul I had 30 liras in my pocket. Let him suffer and learn; let him walk the same roads I walked.” is the wrong one. The goal here is for your child to acquire the same lessons much faster than you did, to take over the business successfully and carry it to the next level. It is not for them to suffer, to feel the pains you felt, or to appreciate your success.

The best thing is to offer your child limited financial resources but unlimited mentorship and an unlimited network. In other words, they should be able to draw on all of your company's accumulated experiential knowledge and its entire social network. Remember, the goal here is only to cap the monetary cost of the mistakes that will be made, not to deny your child every resource and make them suffer.

Of course, leadership is also partly a matter of character. You cannot impose the “learn from the bottom up” method on a child who already has leadership in their blood. It does not suit their character. Leaders are generally somewhat nonconformist types: fond of their independence, with opinions of their own, highly self-confident, full of conviction. Someone who obeys every word will not become a leader. (If it were otherwise, schools could raise leaders, all leaders would come from among the highest-scoring kids at school, and so on.)


Mistake 2: You Cannot Adapt to the New by Perpetuating the Same

Why do living beings go through cycles of birth and death? Because they cannot adapt to the new in any other way. As one ages, changing gets harder, while the environment keeps changing all the time. Young minds, on the other hand, are fresh and adapt easily to rapidly changing conditions.

That is why it is wrong to expect the second generation to carry on the businesses of the first. In today's rapidly changing economy, this approach amounts to a death warrant for a family business. The new generation should work on new businesses. Societies progress thanks to sons who do not follow in their fathers' footsteps. (Otherwise we would all still be working in the fields!)

So what should the new generation be taught, then? Not what to do, but how to do it. They should be taught generalizable approaches to managing people and developing strategy. What should be passed on are not the businesses themselves, settled into a fixed order and beginning to rust, but timeless principles, traditions and philosophies. The business itself should evolve, shed its skin, adapt to new conditions. Only in this way can the growth curve achieved by the first generation be sustained.

The new generation's desire to work on new things stems not from disrespect toward their elders but from respect for nature itself. The older generation's problem, in turn, stems from its inability to understand (or rather, to accept) nature's cycles, from the blindness created by its own success, and from the emotional bond it has formed with the business over time. Yet evolution is built on creative destruction. Whatever does not continually renew itself is condemned to die. Leadership lies not in preserving, but in tearing down and creating anew.

being able to quit at the right time

Every exponential rise in nature sooner or later loses speed and turns into decline. That is why it is hard to become a legend. But if you quit while at the top, if you cut your graph off at the right spot, you can become one. People will then keep extending your graph in their imaginations (unnaturally) at an exponential rate, and say things like “who knows what else he would have done.”

Yet no one has the guts to leave their career at its very peak and retire, or take up some other pursuit. That is why legends are always born of untimely deaths.

Suicide is not a solution either, by the way, because, as Cioran said, we can never get its timing right.

It is not worth the bother of killing yourself, since you always kill yourself too late.
- Emil M. Cioran

hypothesis vs data driven science

Science progresses in a dualistic fashion. You can either generate a new hypothesis out of existing data and conduct science in a data-driven way, or generate new data for an existing hypothesis and conduct science in a hypothesis-driven way. For instance, when Kepler was looking at the astronomical data sets to come up with his laws of planetary motion, he was doing data-driven science. When Einstein came up with his theory of General Relativity and asked experimenters to verify the theory’s prediction for the anomalous rate of precession of the perihelion of Mercury's orbit, he was doing hypothesis-driven science.
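Kepler's mode of working can be replayed in miniature: given nothing but orbital data, a plain least-squares fit recovers the 3/2-power law without any prior hypothesis. A minimal sketch (the orbital figures are standard textbook values; everything else is illustrative):

```python
import math

# (semi-major axis in AU, orbital period in years): standard textbook values
planets = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn":  (9.537, 29.457),
}

# Least-squares slope of log(T) against log(a): the data alone
# recover Kepler's third law, T proportional to a^1.5
xs = [math.log(a) for a, _ in planets.values()]
ys = [math.log(t) for _, t in planets.values()]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
print(round(slope, 3))  # 1.5
```

No model of gravity enters anywhere: the hypothesis (the exponent) falls out of the data, which is the data-driven direction. The hypothesis-driven direction would instead start from a theory and send experimenters looking for confirming observations.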

Similarly, technology can be problem-driven (the counterpart of “hypothesis-driven” in science) or tool-driven (the counterpart of “data-driven” in science). When you start with a problem, you look for what kind of (existing or not-yet-existing) tools you can throw at the problem, and in what combination. (This is similar to thinking about what kind of experiments you can do to generate relevant data to support a hypothesis.) Conversely, when you start with a tool, you try to find a use case where you can deploy it. (This is similar to starting off with a data set and digging around to see what kind of hypotheses you can extract out of it.) Tool-driven technology development is much more risky and stochastic. It is a taboo for most technology companies, since investors do not like random tinkering and prefer funding problems with high potential economic value and entrepreneurs who “know” what they are doing.

Of course, new tools allow you to ask new kinds of questions of the existing data sets. Hence, problem-driven technology (by developing new tools) leads to more data-driven science. And this is exactly what is happening now, at a massive scale. With the development of cheap cloud computing (and storage) and deep learning algorithms, scientists are equipped with some very powerful tools to attack old data sets, especially in complex domains like biology.


Higher Levels of Serendipity

One great advantage of data-driven science is that it involves tinkering and “not really knowing what you are doing”. This leads to fewer biases and more serendipitous connections, and thereby to the discovery of more transformative ideas and hitherto unknown interesting patterns.

Hypothesis-driven science has a direction from the beginning. Hence surprises are hard to come by, unless you have exceptionally creative intuitive capabilities. For instance, the theory of General Relativity was based on one such intuitive leap by Einstein. (There has not been such a great leap since then. So it is extremely rare.) Quantum Mechanics, on the other hand, was literally forced on us by experimental data. It was so counterintuitive that people refused to believe it. All they could do was turn their intuition off and listen to the data.

Previously, data sets were not huge, so scientists could literally eyeball them. Today this is no longer possible. That is why scientists now need computers, algorithms and statistical tools to help them decipher new patterns.

Governments do not give money to scientists so that they can tinker around and do whatever they want. So a scientist applying for a grant needs to know what he is doing. This forces everyone into a hypothesis-driven mode from the beginning, and thereby leads to fewer transformative ideas in the long run. (Hat tip to Mehmet Toner for this point.)

Science and technology are polar-opposite endeavors. Governments funding science the way investors fund technology is a major mistake, and an important reason why some of the most exciting science today is being done inside closed private companies rather than open academic communities.


Less Democratic Landscape

There is another good reason why the best scientists are leaving academia. You need good-quality data to do science within the data-driven paradigm, and since data is so easily monetizable, the largest data sets are being generated by private companies. So it is not surprising that the most cutting-edge research in fields like AI is being done inside companies like Google and Facebook, which also provide the necessary compute power to play around with these data sets.

While hypothesis generation gets better when it is conducted in a decentralized, open manner, the natural tendency of data is to be centralized under one roof, where it can be harmonized and maintained consistently at a high quality. As they say, “data has gravity”. Once you pass certain critical thresholds, data starts generating strong positive-feedback effects and thereby attracts even more data. That is why investors love it. Using smart data strategies, technology companies can build a moat around themselves and render their business models a lot more defensible.

In a typical private company, what data scientists do is throw thousands of different neural networks at some massive internal data sets and simply observe which one gets the job done better. This of course is empiricism in its purest form, no different from blindly screening millions of compounds during a drug-development process. As they say, just throw it against a wall and see if it sticks.
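A toy sketch of that screen-and-select empiricism: the synthetic data set and the candidate “models” (monomials of different degrees) below are invented for illustration; the point is only that the winner is chosen by observation, not by theory:

```python
import random

random.seed(0)

# A synthetic "internal data set": noisy samples of an unknown process
# (secretly y = 2x^2 plus Gaussian noise)
data = [(x, 2.0 * x * x + random.gauss(0, 0.5))
        for x in [i / 10 for i in range(-20, 21)]]
train, holdout = data[::2], data[1::2]

def fit_monomial(points, degree):
    # Least-squares coefficient c for the model y ~ c * x^degree
    num = sum((x ** degree) * y for x, y in points)
    den = sum((x ** degree) ** 2 for x, _ in points)
    return num / den

def holdout_error(c, degree, points):
    return sum((y - c * x ** degree) ** 2 for x, y in points) / len(points)

# The blind screen: throw every candidate at the data and
# keep whichever one gets the job done best on held-out points
errors = {d: holdout_error(fit_monomial(train, d), d, holdout)
          for d in (1, 2, 3, 4)}
best = min(errors, key=errors.get)
print(best)  # 2: the screen settles on the true form without "understanding" it
```

Nothing in the loop knows why degree 2 wins; the data decide, which is exactly what makes the procedure both powerful and opaque.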

This brings us to a major problem about big-data-driven science.


Lack of Deep Understanding

There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

Chris Anderson - The End of Theory

We cannot understand the complex machine learning models we are building. In fact, we train them the same way one trains a dog. That is why they are called black-box models. For instance, when the stock market experiences a flash crash, we blame the algorithms for getting into a stupid loop, but we never really understand why they do so.

Is there any problem with this state of affairs if these models get the job done, make good predictions and (even better) earn us money? Can scientists not adopt the same pragmatic attitude as technologists, focus on results only, settle for the successful manipulation of nature and leave true understanding aside? Are the data sizes not already too huge for human comprehension anyway? Why do we expect machines to be able to explain their thought processes to us? Perhaps they are the beginnings of the formation of a higher-level life form, and we should learn to trust them in the activities they are better at than us?

Perhaps we have been under an illusion all along, and our analytical models have never really penetrated that deeply into nature anyway?

Closed analytic solutions are nice, but they are applicable only for simple configurations of reality. At best, they are toy models of simple systems. Physicists have known for centuries that the three-body problem or three dimensional Navier Stokes do not afford closed-form analytic solutions. This is why all calculations about the movement of planets in our solar system or turbulence in a fluid are all performed by numerical methods using computers.

Carlos E. Perez - The Delusion of Infinite Precision Numbers

Is it a surprise that as our understanding gets more complete, our equations become harder to solve?

To illustrate this point of view, we can recall that as the equations of physics become more fundamental, they become more difficult to solve. Thus the two-body problem of gravity (that of the motion of a binary star) is simple in Newtonian theory, but unsolvable in an exact manner in Einstein’s Theory. One might imagine that if one day the equations of a totally unified field are written, even the one-body problem will no longer have an exact solution!

Laurent Nottale - The Relativity of All Things (Page 305)

It seems like the entire history of science is a progressive approximation to an immense computational complexity via increasingly sophisticated (but nevertheless quite simplistic) analytical models. This trend obviously is not sustainable. At some point we should perhaps just stop theorizing and let the machines figure out the rest:

In new research accepted for publication in Chaos, they showed that improved predictions of chaotic systems like the Kuramoto-Sivashinsky equation become possible by hybridizing the data-driven, machine-learning approach and traditional model-based prediction. Ott sees this as a more likely avenue for improving weather prediction and similar efforts, since we don’t always have complete high-resolution data or perfect physical models. “What we should do is use the good knowledge that we have where we have it,” he said, “and if we have ignorance we should use the machine learning to fill in the gaps where the ignorance resides.”

Natalie Wolchover - Machine Learning’s ‘Amazing’ Ability to Predict Chaos
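A caricature of Ott's recipe, with all specifics invented for illustration: a “true” dynamics, an imperfect physical model that knows only its linear part, and a deliberately primitive learned correction (a nearest-neighbor lookup of past residuals) standing in for the machine-learning half:

```python
import math

def true_step(x):
    # The "real" dynamics: a linear part plus an unmodeled nonlinear term
    return 0.9 * x + 0.5 * math.sin(3 * x)

def model_step(x):
    # The imperfect physical model: it captures only the linear part
    return 0.9 * x

# Training phase: record where the model falls short (its "ignorance")
train_points = [i / 50 for i in range(-100, 101)]
residuals = [(x, true_step(x) - model_step(x)) for x in train_points]

def learned_correction(x):
    # Primitive stand-in for ML: nearest-neighbor lookup of past residuals
    return min(residuals, key=lambda p: abs(p[0] - x))[1]

def hybrid_step(x):
    # Use the good knowledge where we have it, learning where we don't
    return model_step(x) + learned_correction(x)

x0 = 0.37  # a state that is not on the training grid
err_model = abs(true_step(x0) - model_step(x0))
err_hybrid = abs(true_step(x0) - hybrid_step(x0))
print(err_hybrid < err_model)  # True
```

The hybrid beats the bare model not because the learner is clever but because it is pointed only at the gap the physics leaves open, which is the division of labor Ott describes.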

Statistical approaches like machine learning have often been criticized for being dumb. Noam Chomsky has been especially vocal about this:

You can also collect butterflies and make many observations. If you like butterflies, that's fine; but such work must not be confounded with research, which is concerned to discover explanatory principles.

- Noam Chomsky as quoted in Colorless Green Ideas Learn Furiously

But these criticisms are akin to calling reality itself dumb, since what we feed into the statistical models are basically virtualized fragments of reality. Analytical models conjure up abstract epiphenomena to explain phenomena, while statistical models use phenomena to explain phenomena, turning reality directly onto itself. (The reason deep learning is so much more effective than its peers among machine learning models is that it is hierarchical, just like reality.)

This brings us to the old dichotomy between facts and theories.


Facts vs Theories

Long before computer scientists came onto the scene, there were prominent humanists (and historians) fiercely defending fact against theory.

The ultimate goal would be to grasp that everything in the realm of fact is already theory... Let us not seek for something beyond the phenomena - they themselves are the theory.

- Johann Wolfgang von Goethe

Reality possesses a pyramid-like hierarchical structure. It is governed from the top by a few deep high-level laws, and manifested in its utmost complexity at the lowest phenomenological level. This means that there are two strategies you can employ to model phenomena.

  • Seek the simple. Blow your brains out, discover some deep laws and run simulations that can be mapped back to phenomena.

  • Bend the complexity back onto itself. Labor hard to accumulate enough phenomenological data and let the machines do the rote work.

One approach is not inherently superior to the other, and both are hard in their own ways. Deep theories are hard to find, and good quality facts (data) are hard to collect and curate in large quantities. Similarly, a theory-driven (mathematical) simulation is cheap to set up but expensive to run, while a data-driven (computational) simulation (of the same phenomena) is cheap to run but expensive to set up. In other words, while a data-driven simulation is parsimonious in time, a theory-driven simulation is parsimonious in space. (Good computational models satisfy a dual version of Occam’s Razor. They are heavy in size, with millions of parameters, but light to run.)
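The duality can be made concrete with a toy example (both “models” below are illustrative stand-ins, not real simulations): a theory-driven model of the sine function stores a handful of series terms but does real work at every call, while a data-driven model stores a large precomputed table and answers each call with a single lookup.

```python
import math

# Theory-driven: parsimonious in space (a short Taylor expansion),
# but every call loops over the whole series
def sin_theory(x, terms=8):
    total, term = 0.0, x
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total

# Data-driven: parsimonious in time (one lookup per call),
# but it must carry a 10,000-entry table around
N = 10_000
TABLE = [math.sin(2 * math.pi * i / N) for i in range(N)]
def sin_data(x):
    return TABLE[round(x / (2 * math.pi) * N) % N]

x = 1.234
print(abs(sin_theory(x) - math.sin(x)) < 1e-6)  # True
print(abs(sin_data(x) - math.sin(x)) < 1e-3)    # True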

Some people try to mix the two philosophies, injecting our causal models into the machines to enjoy the best of both worlds. I believe this approach is fundamentally mistaken, even if it proves fruitful in the short run. Rather than biasing the machines with our theories, we should just ask them to economize their own thought processes and thereby come up with their own internal causal models and theories. After all, abstraction is just a form of compression, and when we talk about causality we (in practice) mean causality as it fits into the human brain. In the actual universe, everything is completely interlinked with everything else, and causality diagrams are unfathomably complicated. Hence, we should be wary of pre-imposing our theories on machines whose intuitive powers will soon surpass ours.

Remember that, in biological evolution, the development of unconscious (intuitive) thought processes came before the development of conscious (rational) thought processes. It should be no different for the digital evolution.

Side Note: We suffered an AI winter for mistakenly trying to flip this order, asking machines to develop rational capabilities before developing intuitive capabilities. When a scientist comes up with a hypothesis, it is a simple, effable distillation of an unconscious intuition of ineffable, complex statistical form. In other words, it is always “statistics first”. Sometimes the progression from the statistical to the causal takes place out in the open among a community of scientists (as happened in the smoking-causes-cancer research), but more often it just takes place inside the mind of a single scientist.


Continuing Role of the Scientist

Mohammed AlQuraishi, a researcher who studies protein folding, wrote an essay exploring a recent development in his field: the creation of a machine-learning model that can predict protein folds far more accurately than human researchers. AlQuraishi found himself lamenting the loss of theory over data, even as he sought to reconcile himself to it. “There’s far less prestige associated with conceptual papers or papers that provide some new analytical insight,” he said, in an interview. As machines make discovery faster, people may come to see theoreticians as extraneous, superfluous, and hopelessly behind the times. Knowledge about a particular area will be less treasured than expertise in the creation of machine-learning models that produce answers on that subject.

Jonathan Zittrain - The Hidden Costs of Automated Thinking

The role of scientists in the data-driven paradigm will obviously be different but not trivial. Today’s world-champions in chess are computer-human hybrids. We should expect the situation for science to be no different. AI is complementary to human intelligence and in some sense only amplifies the already existing IQ differences. After all, a machine-learning model is only as good as the intelligence of its creator.

He who loves practice without theory is like the sailor who boards ship without a rudder and compass and never knows where he may cast.

- Leonardo da Vinci

Artificial intelligence (at least in its current form) is like a baby: either it is spoon-fed data or it gorges on everything. But, as we know, what makes great minds great is what they choose not to consume. This is where the scientists come in.

Deciding which experiments to conduct and which data sets to use are not trivial tasks. Choosing which portion of reality to “virtualize” is an important judgment call. Hence all data efforts are inevitably hypothesis-laden and therefore non-trivially involve the scientist.

For 30 years quantitative investing started with a hypothesis, says a quant investor. Investors would test it against historical data and make a judgment as to whether it would continue to be useful. Now the order has been reversed. “We start with the data and look for a hypothesis,” he says.

Humans are not out of the picture entirely. Their role is to pick and choose which data to feed into the machine. “You have to tell the algorithm what data to look at,” says the same investor. “If you apply a machine-learning algorithm to too large a dataset often it tends to revert to a very simple strategy, like momentum.”

The Economist - March of the Machines

True, each data-generation effort is hypothesis-laden, and each scientist comes with a unique set of biases generating a unique set of judgment calls, but at the level of society these biases eventually get washed out through (structured) randomization via sociological mechanisms and historical contingencies. In other words, unlike the individual, society as a whole operates in a non-hypothesis-laden fashion and eventually figures out the right angle. The role (and the responsibility) of the scientist (and of scientific institutions) is to cut this search period as short as possible by simply being smart about it, in a fashion not too different from how enzymes speed up chemical reactions by lowering activation-energy costs. (A scientist’s biases are actually his strengths, since they implicitly contain lessons from eons of evolutionary learning. See the side note below.)

Side Note: There is a huge misunderstanding that evolution progresses via chance alone. Pure randomization is a sign of zero learning. Evolution, on the other hand, learns over time and embeds this knowledge at all complexity levels, ranging all the way from genetic to cultural forms. As evolutionary entities become more complex, the search becomes smarter and progress becomes faster. (This is how protein synthesis and folding happen incredibly fast within cells.) Only at the very beginning, in its simplest form, does evolution try out everything blindly. (Physics is so successful because its entities are so stupid and comparatively much easier to model.) In other words, the commonly raised argument against the possibility of evolution achieving so much by pure chance alone is correct. As mathematician Gregory Chaitin points out, “real evolution is not at all ergodic, since the space of all possible designs is much too immense for exhaustive search”.

Another arena where scientists keep playing an important role is in transferring knowledge from one domain to another. Remember that there are two ways of solving hard problems: diving into the vertical (technical) depths and venturing across horizontal (analogical) spaces. Machines are horrible at venturing horizontally precisely because they do not get to the gist of things. (This was the criticism of Noam Chomsky quoted above.)

Deep learning is kind of a turbocharged version of memorization. If you can memorize all that you need to know, that’s fine. But if you need to generalize to unusual circumstances, it’s not very good. Our view is that a lot of the field is selling a single hammer as if everything around it is a nail. People are trying to take deep learning, which is a perfectly fine tool, and use it for everything, which is perfectly inappropriate.

- Gary Marcus as quoted in Warning of an AI Winter


Trends Come and Go

Generally speaking, there is always a greater appetite for digging deeper for data when there is a dearth of ideas. (Extraction becomes more expensive as you dig deeper, as in mining operations.) Hence, the current trend of data-driven science is partially due to the fact that scientists themselves have run out of sensible falsifiable hypotheses. Once the hypothesis space becomes rich again, the pendulum will inevitably swing back. (Of course, who will be doing the exploration is another question. Perhaps it will be the machines, and we will be doing the dirty work of data collection for them.)

As mentioned before, data-driven science operates stochastically in a serendipitous fashion, and hypothesis-driven science operates deterministically in a directed fashion. Nature, on the other hand, loves to use stochasticity and determinism together, since optimal dynamics reside - as usual - somewhere in the middle. (That is why there are tons of natural examples of structured randomness, such as Lévy flights.) Hence we should learn to appreciate the complementarity between data-drivenness and hypothesis-drivenness, and embrace the duality as a whole rather than trying to break it.
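A Lévy flight is a simple concrete example of such structured randomness: mostly short, local steps punctuated by rare, very long jumps drawn from a heavy-tailed distribution. Here is a minimal sketch in Python (the function names and parameters are illustrative, not from the original text), contrasted with an ordinary diffusive walk:

```python
import math
import random

def levy_flight(n_steps, alpha=1.5, seed=0):
    """2-D Levy flight: heavy-tailed (Pareto) step lengths, uniform directions."""
    rng = random.Random(seed)
    x = y = 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        step = rng.paretovariate(alpha)       # heavy tail: rare, very long jumps
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        path.append((x, y))
    return path

def gaussian_walk(n_steps, seed=0):
    """Ordinary diffusive walk for comparison: light-tailed step lengths."""
    rng = random.Random(seed)
    x = y = 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        step = abs(rng.gauss(0.0, 1.0))       # light tail: no extreme jumps
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        path.append((x, y))
    return path

if __name__ == "__main__":
    for name, walk in [("levy", levy_flight(10_000)), ("gauss", gaussian_walk(10_000))]:
        longest = max(math.dist(a, b) for a, b in zip(walk, walk[1:]))
        print(f"{name}: longest single step = {longest:.1f}")
```

The Lévy walker's occasional giant jumps let it explore far more territory than the Gaussian walker for the same number of steps, which is why foraging animals (and, by analogy, serendipitous science) benefit from this blend of local determinism and global stochasticity.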


If you liked this post, you will also enjoy the older post Genius vs Wisdom where genius and wisdom are framed respectively as hypothesis-driven and data-driven concepts.

pain and learning

FAAH is a protein that breaks down anandamide, also known as the “bliss molecule,” which is a neurotransmitter that binds to cannabinoid receptors. These are some of the same receptors that are activated by marijuana. With less FAAH activity, this patient was found to have more circulating levels of anandamide, which may explain her resistance to feeling pain.

... Dr. James Cox, another author and senior lecturer at the Wolfson Institute for Biomedical Research at University College London, said, “Pain is an essential warning system to protect you from damaging and life-threatening events.” Another disadvantage to endocannabinoids and their receptor targets is that poor memory and learning may be unwanted byproducts. Researchers said the Scottish woman reported memory lapses, which mirrors what is seen in mice missing the FAAH gene.

Jacquelyn Corley - The Case of a Woman Who Feels Almost No Pain Leads Scientists to a New Gene Mutation

Pain is needed to register what is learned. As they say, no pain no gain.

You can easily tell that you are not learning much if everything is flowing too smoothly. You take notice only upon encountering the unexpected and the unexpected is painful.

I advise mature students to stay away from well-written textbooks. They are like driving on a wide and empty highway. Typos keep you alert, logical gaps sharpen your mind and bad arguments force you to generate new ideas. You should generally make the reading process as hard for yourself as possible.

Educational progress can be achieved by making either the content or the environment more challenging. If you can perform well under constraints, you will perform even better when the environment normalizes.


Engagement enhances learning not because it increases focus but because it increases grit. Struggle is necessary. If the teaching is not engaging, students will more easily give up on the struggle. The goal is not to eliminate the struggle.

The more confident a learner is of their wrong answer, the better the information sticks when they subsequently learn the right answer. Tolerating big mistakes can create the best learning opportunities.

David Epstein - Range (Page 86)

So the harder you fall, the better. The more wrong you turn out to be, the more unforgettable the experience will be. As they say, never waste a good crisis.

People usually go into defensive mode when their internal reality clashes with the external reality. That is basically why persuasion is such a hard art form to master. The radicalized easily become even more radicalized when you try to lay a convincing path to moderation.

Of course, there are times when you need to close up, refuse to learn, and stick with your beliefs. The world is complex, situations are multi-faceted, and refutations are never really that clear. In some sense, every principle looks stupid in certain contexts. The principled man knows this and nevertheless takes the risk, because he thinks that looking stupid sometimes is better than looking like an amorphous mass of jelly all the time. Someone who is constantly learning, and therefore constantly in revision mode, runs the danger of becoming jelly-like. Sometimes one may need to prefer the pain of resisting to the pain of learning.


The essence of the neuromatrix theory of pain is that chronic pain is more a perception than a raw sensation, because the brain takes many factors into account to determine the extent of danger to the tissues. Scores of studies have shown that along with assessing damage, the brain, when developing our subjective experience of pain perception, also assesses whether action can be taken to diminish the pain, and it develops expectations as to whether this damage will improve or get worse. The sum total of these assessments determines our expectation as to our future, and these expectations play a major role in the level of pain we will feel. Because the brain can so influence our perception of chronic pain, Melzack conceptualized it as more of “an output of the central nervous system.”

Norman Doidge - The Brain’s Way of Healing (Page 10)

Pain is not an objective factor. As with everything else, it is gauged in an anticipatory manner by the mind. If you implicitly or explicitly believe that the associated costs will be greater, your pain will be greater.

Since pain is necessary for learning, learning too must be done in an anticipative manner. That is why proper coaching is so essential. The student needs to have some idea about what he desires for the future, so that his cost function becomes better defined.

When one has no expectation from the future, one is essentially dead and floating, and has reverted back to basic-level survival mode. You need to make yourself susceptible to higher forms of pain. Some of the greatest minds I have met had mastered the art of getting mad and pissed-off. They were extremely passionate about some subject and had cultivated an exceptional level of emotional sensitivity in that area.

weaknesses and biases

All weaknesses arise from certain extremities and all successes are traceable to certain extremities. So defend your extremities, and in order to not suffer from the accompanying weaknesses, choose the environments you walk into carefully. All weaknesses manifest themselves contextually. Learn how to manage the context, not the weakness.

Similarly, your biases are your strengths. Defend them fiercely. They are what differentiates you from others. Thinking is methodological. Creativity is overrated. (Both can be learned.) What matters is the input, and input is shaped by biases.

complexity and failure

Complex structures that are built slowly over time via evolutionary processes (e.g. economies, companies, buildings, species, reputations, software) tend to be robust, but when they collapse, they do so instantly.

In the literature, this asymmetry is called the Seneca Effect, after the ancient Roman Stoic philosopher Lucius Annaeus Seneca, who said "Fortune is of sluggish growth, but ruin is rapid".
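The asymmetry can be put in numbers with a back-of-the-envelope calculation (the growth and crash figures below are hypothetical, chosen only for illustration):

```python
import math

def steps_erased_by_crash(growth_rate=0.01, crash_fraction=0.5):
    """How many steps of slow compound growth a single sudden crash undoes.

    Value compounds by `growth_rate` per step; a crash multiplies it by
    (1 - crash_fraction) in one step. Solving
    (1 + growth_rate)**n * (1 - crash_fraction) = 1 for n gives:
    """
    return -math.log(1.0 - crash_fraction) / math.log(1.0 + growth_rate)

if __name__ == "__main__":
    # One 50% collapse wipes out roughly 70 steps of patient 1% growth.
    print(round(steps_erased_by_crash(0.01, 0.5)))
```

Fortune compounds one small multiplication at a time; ruin is a single multiplication, which is exactly the Seneca asymmetry.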

Some remarks:

  • That is why only highly educated people can be spectacularly wrong. Only with education can one construct contrived, highly complex arguments of the type that can fail on several different levels and lead to spectacular failure. (Remember, the most outrageous crimes in history were carried out in the name of complex ideologies.)

  • That is also why good product designers think hard before beginning a design process that is sure to complexify over time. Complex designs collapse in their entirety and are very difficult to salvage or undo. Similarly, good businessmen think hard before opening a new business, since closing one later is a much harder process.

  • Once entrepreneurs start building a business, they immediately start to suffer from sunk-cost and negativity biases, which are specific manifestations of the much more general asymmetry between construction and destruction. We tend to be conservative with respect to complex structures because they are hard to build but easy to destroy. (Unsurprisingly, these psychological biases look surprising to theoretical economists, who have never really built anything complex and prone to failure in their lives.)

physics as study of ignorance

Contemporary physics is based on the following three main sets of principles:

  1. Variational Principles

  2. Statistical Principles

  3. Symmetry Principles

Various combinations of these principles led to the birth of the following fields:

  • Study of Classical Mechanics (1)

  • Study of Statistical Mechanics (2)

  • Study of Group and Representation Theory (3)

  • Study of Path Integrals (1 + 2)

  • Study of Gauge Theory (1 + 3)

  • Study of Critical Phenomena (2 + 3)

  • Study of Quantum Field Theory (1 + 2 + 3)
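To make the combinations concrete, each pure principle has a canonical textbook formula, and Path Integrals show how two of them fuse. (These are standard expressions, not from the original text.)

```latex
% 1. Variational: classical paths extremize the action
\delta S = \delta \int L(q, \dot{q}, t)\, dt = 0
% 2. Statistical: states are weighted by the Boltzmann factor
p_i = \frac{e^{-E_i/k_B T}}{Z}, \qquad Z = \sum_i e^{-E_i/k_B T}
% 3. Symmetry: Noether's theorem pairs each continuous symmetry
%    with a conserved current
\partial_\mu j^\mu = 0
% (1 + 2) Path Integrals: a sum over all histories, each weighted
%         by the action it accumulates
Z = \int \mathcal{D}[q]\, e^{iS[q]/\hbar}
```

The path-integral formula makes the fusion visible: the action $S$ comes from the variational side, while the weighted sum over histories comes from the statistical side.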

Notice that all three sets of principles are based on ignorances that arise from our being inside the structure we are trying to describe.

  1. Variational Principles arise due to our inability to experience time as a continuum. (Path information is inaccessible.)

  2. Statistical Principles arise due to our inability to experience space as a continuum. (Coarse graining is inevitable.)

  3. Symmetry Principles arise due to our inability to experience spacetime as a whole.  (Transformations are undetectable.)

Since Quantum Field Theory is based on all three principles, it seems that most of the structure we see arises from these sets of ignorances themselves. From the hypothetical outside point of view of God, none of these ignorances is present, and therefore none of the entailed structures is present either.

The study of physics is not yet complete, but its historical progression suggests that its future depends on our discovering new aspects of our ignorance:

  1. Variational Principles were discovered in the 18th Century.

  2. Statistical Principles were discovered in the 19th Century.

  3. Symmetry Principles were discovered in the 20th Century.

The million-dollar question is which principle we will discover in the 21st century. Will it help us merge General Relativity with Quantum Field Theory, or simply lead to the birth of brand-new fields of study?

waves of decentralizations

Evolutionary dynamics always start off well-defined and centralized, but over time (without exception) they mature and decentralize. Our own history is full of beautiful exemplifications of this fact. In historical order, we went through the following decentralization waves:

  • Science decentralized Truth away from the hegemony of Church.

  • Democracy decentralized Power.

  • Capitalism decentralized Wealth.

  • Social Media decentralized Fame away from the media barons.

Today, if you are not powerful, wealthy, or famous, there is no one to blame but yourself. If you do not know the truth, there is no one to blame but yourself. Everything is accessible, at least in theory. This, of course, inflicts an immense amount of stress on the modern citizen. In a sense, life was a lot easier when there was not so much decentralization.

Note how important the social media revolution really was. Most people do not recognize the magnitude of the change that has taken place in such a short period of time. In terms of structural importance, it is on the same scale as the emergence of democracy. We no longer distinguish a sophisticated judgment from an unsophisticated one. Along with “Every Vote Counts”, we now also have “Every Like Counts”.

Of course, the social media wave was built on another, even more fundamental decentralization wave: the internet itself. With the rise of the internet, communication became completely decentralized. Today, in a similar fashion, we are witnessing the emergence of blockchain technology, which is trying to decentralize trust by creating neutral trust nodes with no centralized authority behind them. For instance, you no longer need to be a central bank with a stamp of approval from the government to launch a currency. (Both the internet and blockchain undermine political authority and, in particular, render national boundaries increasingly irrelevant.)

Internet itself is an example of a design, where robustness to communication problems was a primary consideration (for those who don't remember, Arpanet was designed by DARPA to be a communication network resistant to nuclear attack). In that sense the Internet is extremely robust. But today we are being introduced to many other instances of that technology, many of which do not follow the decentralized principles that guided the early Internet, but are rather highly concentrated and centralized. Centralized solutions are almost by definition fragile, since they depend on the health of a single concentrated entity. No matter how well protected such central entity is, there are always ways for it to be hacked or destroyed.

Filip Piekniewski - Optimality, Technology and Fragility

As pointed out by Filip, evolution favors progression from centralization to decentralization because it functionally corresponds to a progression from fragility to robustness.

Also, notice that all of these decentralization waves initially overshoot due to the excitement caused by their novelty. That is why they are always criticized at first for good reasons. Eventually they all shed off their lawlessness, structurally stabilize, go completely mainstream and institutionalize themselves.