Caveats: I’m much better at assessing negatives and mistakes than observing positives so this list skews heavily towards the former even though on net I enjoyed my PhD. Also, I had an NSF fellowship coming in which definitely alters the dynamics of the PhD.
Overall: I loved doing a PhD. For me, given the background that I came with (I barely knew how to program and didn’t know any optimization or controls), the opportunity cost was extremely low and I loved the opportunity to mostly set my own direction. I have a fair number of complaints but it was an excellent way to spend six years. More relevant to this post, I made a huge number of avoidable mistakes that made it a worse experience than it could have been.
Things that went poorly
At a high level, most of the challenges I faced during my PhD were a side effect of paper-writing mania. It turns out there is in fact such a thing as trying to work too much and that the consequences are pretty bad. The other major failure mode was not recognizing early that the natural equilibrium of a PhD is pretty severe isolation and that you need to intentionally work to avoid it. It turns out that managing your emotions is as core a skill for succeeding in the PhD as being good at research is.
Below I detail some of the different problems these issues caused as well as some ideas on how they could have been avoided.
Too many projects
At the lowest point of my PhD I was attempting to juggle four projects on which I was the primary contributor. I am confident that there are people capable of this (certainly the people who publish 4 first-author NeurIPS papers a year must be), but I am not one of them. Chances are, you’re not one of them either! The worst thing is that once you pick up multiple projects, multiple groups of people are depending on you, which makes it hard to drop any given project because of the added stress of disappointing people you care about. As a rule of thumb I now believe you can have at most:
- 1 project that you’re the primary writer / programmer / theorem prover on
- 1 project that you have large intellectual contributions to but minimal programming time / expected theorem proving
I would even say the second project is pushing it and you’re better off having one project and using the spare time when that project is stuck to learn, study, and mess around.
It’s worth exploring the underlying psychological reason why the “too many projects” trap happens. I think that knowing why this happens can help you avoid the failure mode. In my experience it was one of three things:
- A strong desire to work with a particular researcher that I admired. There were so many folks whose work I respected and when given the opportunity to work with them, I found it really hard to turn down. In retrospect, this is obviously a huge mistake, since what winds up happening is that you disappoint a lot of people you respect when you inevitably have to drop the project because having more than 2 projects is not possible for most people.
- Variance reduction. Any given project can fail, so the thinking goes that if you have N projects, at least one of them will succeed. The mistake here is not recognizing that the projects are going to fail precisely because you’re trying to do them all at once and can’t give them the attention they need to succeed. Put another way, you might be able to have 4 projects in a year if you go at them one at a time and they all happen to succeed quickly, but if you try to do 4 projects at once they’ll almost certainly all fail due to insufficient focus. Doing them in series instead of in parallel appears to work better.
- A failure to recognize the order in which things occur. One thing that I didn’t know at the time is that having one astonishingly good paper >> many okay papers. Interestingly, once you have that one great paper, it appears to generate tons of other papers as a consequence of all the new, interesting avenues that it opens. Had I known this, I probably could have avoided this trap. My theory is that I got the order of causation slightly wrong: it’s not that the excellent researchers write a ton of papers from the get-go, it’s that they write one great paper which in turn opens doors to many others.
Low hanging fruit is a nonsense concept
I distinctly remember, at some point in my PhD, saying “it’s a low-hanging fruit paper. We can write it quickly, it’ll be a neat paper, and then we can move on.” It is almost never the case that a paper wraps up quickly. For one of many reasons it’s going to take longer than you expected and if you are not deeply, personally committed to getting an answer to the question posed you will eventually come to hate the paper and the time you have spent on it. There are going to be some real lows in any project and “it’ll be another paper to add to my CV” or “well the result will be pretty cool” is not going to be enough to carry you through those lows without some real despair.
At this point my mental model is that:
- There’s no such thing as a low hanging fruit paper.
- The absolute only circumstances in which you should write a paper are when the question it answers is deeply personally satisfying. This requires a lot of self knowledge to understand which questions you need an answer to!
This isn’t to say that you should only work on papers that are huge and impactful. There are many types of papers you can write, some addressing problems of immense importance and some just investigating curiosities. In the end, what turns out to be essential to me is that either the process of working on the topic or the prospect of getting an answer to the question is deeply joyful or satisfying. Unfortunately, this isn’t something you can figure out without serious introspection.
Lack of feedback
One pretty distinct aspect of spending time in industry was being surrounded by a ton of excellent programmers / researchers that I could regularly get feedback from. That’s not to say that the PhD students around me weren’t excellent (they are!) but industry has more direct, consistent feedback mechanisms and clear opportunities for mentorship that help you become, at the very least, a better programmer much faster than staring at your terminal and being totally stuck. There’s a lot of value in learning how to get yourself unstuck (this is one of the many useful skills learned in a PhD), but there are also a lot of places you’ll get stuck that are not actual opportunities for growth but just a waste of time.
Worse, when you’re stuck it’s not as though there’s an obvious person to go ask for help. Generally, your advisor isn’t intended to help with low-level questions and your lab-mates will frequently have similar skill-sets to you. In the large companies I’ve interned at, there was always someone I could turn to when I was sufficiently stuck on a low-level issue and I wound up with significantly less time wasted and learned something to boot.
In internships focused on publishing, your mentor can be quite hands-on, which is very useful if your mentor is an expert in their particular subfield. While I can only speak to Berkeley, my observation from talking to many students is that the advisors can be hands-off. Of course, this is often what students prefer, but there are distinct advantages lost. Many incoming students view the PhD as an apprenticeship in which they expect to learn directly from their advisor, and many of the benefits of apprenticeship are lost with a hands-off advisor.
I think you can get yourself out of this failure mode but it requires a lot of intentionality. You need to actively find forums and places where you can post your issues, set up regular meetings with colleagues to get critical feedback, and actively solicit critical feedback from your advisor. There may be labs that have this structure set up but it is not the natural equilibrium of a PhD: you have to set it up.
Lack of interaction
This is likely not the case for everyone (though it is the case for many folks I’ve spoken to), but the first half of my PhD was quite isolating. While I loved (and love) my labmates, I did not feel connected to the broader community at my university or outside of it. I took most of my courses on my own, generally did not have a sense of what work other folks in my cohort were doing, and found it difficult to form collaborations. This changed during COVID, when I wound up with a lot of collaborations in the multiagent RL (MARL) community and with other MARL folks at Berkeley due to significantly lower barriers to just reaching out to and chatting with random folks.
I think a lot of this could have been avoided with more free time and intentionality. In the last two years of my PhD I started a MARL seminar with Natasha Jaques which introduced me to a ton of folks with overlapping interests. This significantly lessened the degree of isolation I felt and opened up a lot of fun collaborations. There wasn’t really anything preventing me from starting or being active in a seminar series much earlier (except that I was unaware many seminar series even existed due to the aforementioned issues) and it probably would have had a similar effect.
Note: some of these issues might just have been because (1) I had mono for my first month at Berkeley and so missed all of orientation and meeting people and all that. (2) I was in the mechanical engineering department but worked primarily in the EECS department. Consequently I simply wasn’t on a lot of mailing lists, didn’t hear about a lot of events, did a different set of prelims from all my colleagues, etc.
Lack of learning
I was so obsessed with writing papers that I definitely spent less time thinking, reading, and learning than some of my colleagues who explicitly devoted the first few years of their PhD to deep thinking and learning and deferred a focus on paper publications to later. In the last couple years I became more explicit about this and set aside time to read and study but the first couple crazed years of “I need to write papers to establish myself” had minimal study and learning time.
I’ve expressed this to a few colleagues and suspect that it’s an extremely common trap. There’s always so much to do and so it’s easy to drop things that require explicit effort to keep on your calendar in favor of the stuff that will force itself onto your calendar. Watch out, this failure is extremely pernicious, easy to fall into, and has far-reaching consequences.
Things that went well
Lab selection process and advising
I went into graduate school knowing that I wanted to work in controls but with extreme uncertainty about the topic because I was so new to the field that I didn’t have the confidence to choose a particular direction (note, this is not normally a set of good conditions under which to do a PhD. You are far better off if you go in with a clear direction). Having accepted that I couldn’t reduce this uncertainty, I instead chose to pick labs on the basis of personal fit and how supportive the professor appeared to be.
This turned out to be a great idea. Every student I spoke to about my advisor talked, unprompted, about how much backing the professor gave them; one essentially said “Alex will support you no matter what your goals turn out to be.” This turned out to be true; when I decided to focus on RL at a time when there was pretty widespread skepticism about it in the controls community (there still is, just less now), my advisor worked hard to help me acquire compute resources, to connect with groups that could help, and basically to do anything he could. If you don’t know exactly what topic you’d like to work on in your PhD, I highly recommend just picking a kind, supportive advisor in the appropriate field.
Freedom to “waste time”
If you can get over the fear of needing to be “productive” at all times, you can invest a lot of time into things that are near impossible to do outside of a PhD. Want to spend all day writing a simulator without a clear “customer”? Want to spend your day reading textbooks on an obscure topic that only might turn out to be relevant or reading through the DeepMind codebases to learn their tricks and coding style? Want to spend a month writing JAX tutorials just because? All these things are possible in a PhD if you can manage your emotional state. One summer I spent a month doing a deep dive into Online Learning for no particularly good reason except that I thought it might be useful eventually; I have no idea how I would have justified this in industry except in a few special industry research labs.
Note: “waste time” is in quotes because I don’t believe this is a real waste of time.
Oh did I ever love having undergraduate mentees. They’re curious, enthusiastic, and it’s such a pleasure to watch them become full-fledged researchers and core collaborators. This is definitely in my top-two favorite things about the PhD.
It turns out that I love multi-agent learning and autonomous vehicles! I got to spend most of my PhD working on these topics and now have a faculty position where I’m going to keep working on them. This is basically the ideal outcome of a PhD: it qualifies you to do a type of work that would be challenging for you to do otherwise. This is also the main reason I consider my PhD to have been a net positive; I’m basically set for life in being able to do the type of work that brings me joy (unless the autonomous vehicle industry collapses, which isn’t totally impossible). Yes, technically you can get an industry position off the bat and work your way into this type of research work, but this is the exception and not the norm.
Meeting great colleagues
My lab-mates are friends for life, as are many of the folks I’ve met in the last few years in the MARL and autonomous vehicle communities. Once I got embedded into a particular community, I got to enjoy in-depth, hyper-specific discussions on topics that I was obsessed with, alongside equally obsessed folks who probably know more about these topics than anyone else in the world. Academia lets you become hyper-niched in a way that industry often doesn’t, and while this can have its downsides, hyper-niched folks are some of the most interesting folks in the world to talk to since they are almost guaranteed to have a unique perspective.
Conferences + Chained Vacations
ML conferences are frequently in really wonderful spots due to the pressure to rotate them from country to country to increase access. I’d always take a few days on either the front or the back end of a conference to travel which got me to some spots I likely wouldn’t have visited otherwise. Also, the more focused conferences are great fun since almost everyone there is qualified to chitchat about the topics you’re interested in. This is less the case at the larger conferences like ICML or CDC where they’re so large that you have to purposefully seek out good conversations.
Do you want to make a ton of side money while learning from experts about how things actually work in practice, without being tied down to the less fun responsibilities of a job like meetings? I did:
- an internship at Tesla Autopilot that taught me a ton about AVs.
- an internship at DeepMind that let me interact with one of the largest groups of multi-agent learning experts in the world. It also exposed me to a drastically different perspective on how to organize research.
- a visiting research position at FAIR that gave me an unparalleled amount of research freedom amidst a wonderful community. As above, it exposed me to a very different way to organize a research community, something I’m still mulling on.
The internships also helped me feel more comfortable remaining in academia without constantly worrying about whether the grass was greener somewhere else. Industry is great but I quite like where I’m at.