I worked for years on the modernization of a critical banking infrastructure system. The problems are manifold. The biggest problem is overall architectural impedance mismatch - switching from a batch system (most of those old COBOL mainframe apps are batch systems) to a service-oriented architecture. This makes incremental replacement extremely difficult. But a cutover? On a system that moves more money in a day than the GDP of most countries? A failure could put the entire banking world in a tailspin, cause an economic crash.
Next, there's a lot of business logic basically embedded in the COBOL, or even at lower layers. For example, lots of banking files are in EBCDIC, a different character set from ASCII. Except there are lots of different EBCDIC variants, and there's no good way to tell which one you're viewing just by looking at the file. So you have to reverse-engineer COBOL to figure out the "correct" meaning of a given file.
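To make the ambiguity concrete, here is a minimal Python sketch; it is not the approach used on that project. It assumes the candidate code pages are cp037, cp500, and cp1140 (three EBCDIC variants that ship with Python's standard codecs). The function names and the sample record are made up for illustration. It decodes the same bytes under each candidate and reports the byte offsets where the results disagree.

```python
# Candidate EBCDIC code pages; which ones matter for a real file is an assumption.
CANDIDATES = ("cp037", "cp500", "cp1140")


def decode_candidates(raw: bytes, codecs=CANDIDATES) -> dict:
    """Decode the same bytes under every candidate code page."""
    return {name: raw.decode(name, errors="replace") for name in codecs}


def ambiguous_offsets(raw: bytes, codecs=CANDIDATES) -> list:
    """Return byte offsets whose decoded character depends on the code page."""
    decoded = decode_candidates(raw, codecs)
    return [
        i for i in range(len(raw))
        if len({text[i] for text in decoded.values()}) > 1
    ]


if __name__ == "__main__":
    # Hypothetical record, written by a program that used cp037.
    record = "PAY [1]!".encode("cp037")
    for name, text in decode_candidates(record).items():
        print(name, repr(text))
    print("offsets that differ:", ambiguous_offsets(record))
```

Letters and digits decode identically under all three, which is why nothing in the file itself tells you which variant was used. Only some punctuation offsets differ, and it is the program that wrote the file that settles which reading is "correct".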
The problems go on and on. When I see people in the startup world rolling their eyes at the "incompetence" of the enterprise world, I take it to mean they've never actually worked on a truly hard problem in their lives.
> When I see people in the startup world rolling their eyes at the "incompetence" of the enterprise world, I take it to mean they've never actually worked on a truly hard problem in their lives.
I worked in a related field in the past. I don't claim that the problems are easy. Where I tend to roll my eyes is rather:
- Bureaucracy: A lot is required by law, I know. But there is a difference between "just following the required bureaucracy in a minimal necessary way if it stands in the path" (the startup way) vs. "taking it seriously in a way that makes the work harder than strictly necessary".
- Hierarchies: Just three words: I hate them.
- Unwillingness (?) to tackle these hard problems: The problems are hard, as you already outlined. But this implies to me that everybody in the company will move heaven and earth so that the people who work on these hard problems are able to (e.g. by giving them all the necessary information they know of, such as requirement specifications). If there is just one thing (or even office politics) that prevents this, I don't just roll my eyes, but get furious. Given how hard the problems are, this is not to be considered an obstacle, but targeted sabotage, and should be treated as such.
Part of this comes from an inability in the early years to truly respect engineering as both a craft and as a form.
I know your startup mindset well and I carry it with me too. I came into a legacy fintech company the same way and pushed for faster decision-making processes.
I didn't realize the cause for conservatism until I was given a story about how the company needed to manually call and refund thousands of customers... all because one developer fucked up and double charged people.
When you deal with MONEY and you experience getting burned like that... you realize how mantras like "move fast and break things" only work as convenient mottos for startups who have nothing to lose.
I'm starting to blog about some of the issues around these things. One I want to write about - especially because it triggers controversy when I say it - is the idea that it's more important to not be wrong than it is to be right. So no matter how obviously right a move is, fear that it just might be wrong hampers decisions.
Early in my career in the 90's I worked for BZW working on derivative hedging for financial swaps.
Things were pretty hectic and it was a pretty small team. We were using SQL Server at best and Excel spreadsheets as information feeds at worst, trying to calculate Black-Scholes on this stuff.
I clearly remember talking to the Traders and Quants regarding certain calculations we were doing to give them bond price goals for offsetting risk on the trades.
Honestly the traders at least didn't give a crap. I would present them tables; they would look at it, and would say "yeah that looks about right" - that's a direct quote from BZW's lead trader in 1994. I can't imagine things have changed that much.
I didn't stay long in that environment; it was pretty clear to me that, despite getting people like Grady Booch coming in to clean up our act, our "Customers" didn't really care too much about the mechanics of how things worked or, even worse, whether the calculations were correct.
Top and bottom is, while the banking industry may employ "Cowboys" in the back-office for IT services, they also employ Cowboys in the front-office making the trades.
I doubt that has changed for the better that much. See 2008 financial crisis et al.
[Edit] As a side note, for the time I was earning more money than I knew what to do with. My boss at the time felt so ambivalent about his twice-yearly 50k (sterling) bonus that he threw it away at the Casino the day he got it (remember, this is 1994). I left and took a 75% pay cut to go work for Microsoft on projects that were, at least from a CS point of view, a lot more respectable. Having said that, I don't want to come off too harshly; we were at the cutting edge at the time and the technology was very cool and thoroughly enjoyable. But still..
Startups have yet to solve the problem of legacy code. At the very center of Internet companies too old to still be called startups is some ancient Perl or PHP written by the founders, back when they were around, and even wrote code. That might seem less archaic than COBOL or faxes but it's the same root problem. "If it ain't broke, don't fix it" didn't account for "well you see, it's not catastrophically broken but it's holding us back"; maintenance programming isn't sexy, like maintaining bridges and highways that have already been built, but just as necessary for continued operation.
Yes and no. Some of the more successful "startups" like Google and Facebook have slowly rewritten the bulk of their systems over time. In the case of Facebook in particular, there's very little if any of the early scary spaghetti PHP left.
Well, since I have also successfully managed similar such projects for financial institutions & telcos, and I live in startup land, by your last remark I feel moderately qualified to comment.
I'll challenge your view that folks in the startup world don't know enterprise. Maybe some visible fraction are the young and inexperienced hipsters as portrayed on HBO, sure, but most of those I know in CTO+ roles actually have a lot of enterprise under the belt. In my case in a B2B play it's practically mandatory, in order to understand the customer.
I believe enterprises suffer principally from the fear of change, or more bluntly, the fear of screwing up and being held accountable, which leads to the pathological technical debt issues you've described. So the problems I've always faced in enterprise projects are not primarily technological, but instead those of a) finding a full team of people competent enough and fearless enough to perform transplant surgery on the beating heart of a living body corporate and b) collecting sufficient clout to be allowed to perform the operation.
I reckon the best thing you can do, as a project leader in the enterprise world, is leave a legacy of constant and gradual change. Normalize frequent updates through CI/CD. Get business owners used to things like minor feature requests being included in daily deploys. No-one will thank you at the time, but a change in culture is almost certainly the most enduring value you can create.
So yeah, I'll happily roll my eyes at the "incompetence of the enterprise world", because I've dealt with the stupid head-on, and used techniques from startup land to inoculate it permanently.
Furthermore lots of decisions in enterprises look like the following:
An executive is approached by a vendor. The vendor entices the executive, shows them a good time, gives them a really good assurance their service is worth it.
An engineer hates this service because it sucks. Because the decision was made based on how cool it looks, not on technical needs.
Given that, it is hard for an executive to get approached by a vendor who says "okay, this will cost a crazy amount of time and money but we'll make your systems more modern" which sounds like "hey, I'm going to come in and offer to replace a system that has worked for 30 years with something modern and potentially risky. And it'll cost you a lot." The executive doesn't (usually) realize that the true cost is really high and goes up with time. And of course they don't want to lose their cushy job so hell no they won't take it. Also the executive isn't directly working with the engineers so he doesn't truly know if he can trust them.
> I take it to mean they've never actually worked on a truly hard problem in their lives.
I certainly have some empathy for this view. On the other hand, the right time to have addressed actually fixing some of these problems is 25 years ago. The second best time is today (to steal/abuse a phrase). Enterprise organizations sometimes punt these things down the road with half-assed solutions. It's cheaper today and tomorrow maybe it will be someone else's problem, right? All the while the overall issue becomes worse.
It sucks sometimes to be at the bottom of a deep hole you dug yourself into without a ladder, but at the end of the day, it's your hole.
And you are right that sometimes it's just a hard problem. But you can always make those worse.
Sometimes, 25 years ago, a solution to the problem wasn't available. I worked on state-of-the-art systems from 25 years ago. We were writing homegrown streaming data protocols over raw sockets, parsed with lex and yacc. We didn't have ssh, we didn't have http, we didn't have xml (much less json). A world-class system from 25 years ago would be scrap today... how many bright young junior programmers today could update a lex/yacc parse stream, or handle socket programming or DOS HIMEM?
Improvement needs to be continuous. The ability to update individual parts of the system with minimal coupling is vital. But even keeping that as the system evolves is a challenge - and designing for it in advance leads to all sorts of unnecessary "just in case" abstractions in the code.
Keeping code alive and running for a generation is a whole different kind of challenge.
I agree on the continuous improvement - I wasn't suggesting you do this once and stop thinking about it.
But note, 25 years ago we didn't have the same solutions we might have today, but we had good solutions to lots of common problems. We certainly had solutions to "system is specified by a mixed bag of ASCII and inconsistent EBCDIC files, none alike, all specified 15 years ago", which is at the heart of the problem OP posited. 25 years ago people were saying exactly the same thing about banking systems in COBOL that beat describes. Exactly. The batch processing OP discusses had already been out of vogue for a decade at least. We had good solutions for nearly all of these problems; what we didn't have was quick, cheap solutions.
Just for completeness: we had html. We'd had SGML for a decade (which begat HTML and later XML). We had reasonable streaming protocols. We were a lot worse at connecting heterogeneous systems controlled by different entities and interoperating, but we were good at networking and building distributed systems at a smaller scale.
Keeping code alive and running for a generation is a very difficult problem, but keeping systems tidy, modular, and evolving is manageable, until you let them go too much.
And people are digging the same holes today. It's not technology that is the cause, it never was. It's cost and short term planning.
> how many bright young junior programmers today could update a lex/yacc parse stream, or handle socket programming or DOS HIMEM?
Almost all of them – if they're really "bright" anyways. Even given that a lot of the important context is missing, bright programmers can do this stuff.
that's disingenuous. anyone can do anything given enough time to learn the skills. what OP is asking is how many of today's programmers have the skills already. the answer is very few.
Indeed. If someone suggested writing a custom stream parser for something as simple as scanned images today, I'd point them right back at the wide array of off-the-shelf, standardized solutions.
Sure, a good programmer can learn this stuff. But they shouldn't have to, not these days. There's far more to programming than any one person could ever learn. Choose your battles.
As my Director says: "Yes, we're solving problems and making code better, but never forget, this code has run the business for 10 years, so give it credit that it did do its job."
At a certain point, continuing to try to move elephants around is going to get you nowhere.
Death is a necessary component of change. In fact, renewal could not come without death.
Existing legacy systems bring with them assumptions about how things ought to work, and debt about expectations -- expectations that slow down your ability to change away from existing paradigms.
True innovation requires this breakaway.
So honestly, IMO the best move for a bank that is facing this kind of software nightmare is to maintain existing legacy support for the old system, but do a complete breakaway (NOT REWRITE) that is explicitly NOT dependent on the old contracts of functionality that the old system would have imposed. Make the rules change, acknowledge the new system will break with the existing system, and plan for a data migration over wherever possible.
Accepting defeat and moving on is a saner path. Migrating the data will become possible once it's realized that ultimately data is easier to change over than behaviour.
I say this too as someone who is very against rewrites generally. It's a fallacy to believe that old systems can accommodate new.
Looking back on that giant rewrite project, that's how I'd have done it... I'd have built the new system in parallel with the old one, not sharing the data store. The new system would have significant advantages over the old system (i.e. near-realtime transactions rather than waiting for overnight batch jobs). Get it running, and encourage early-adopter customers to switch over. That will stress-test it and allow it to scale. After a few years, with lots of warning, retire the old system.
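A rough sketch of what that opt-in split could look like, in Python. Every name here (`Payment`, `LegacyBatchSystem`, `RealtimeSystem`, `Router`) is hypothetical and the two systems are reduced to in-memory lists. A real migration would also have to move each customer's balances and history at opt-in time, which is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Payment:
    customer_id: str
    amount_cents: int


@dataclass
class LegacyBatchSystem:
    """Stand-in for the old system: payments wait for the overnight run."""
    queue: list = field(default_factory=list)

    def submit(self, payment: Payment) -> str:
        self.queue.append(payment)
        return "queued-for-batch"


@dataclass
class RealtimeSystem:
    """Stand-in for the new system, with its own separate data store."""
    ledger: list = field(default_factory=list)

    def submit(self, payment: Payment) -> str:
        self.ledger.append(payment)
        return "settled"


@dataclass
class Router:
    legacy: LegacyBatchSystem
    modern: RealtimeSystem
    early_adopters: set = field(default_factory=set)

    def opt_in(self, customer_id: str) -> None:
        """Move one customer to the new system; everyone else is untouched."""
        self.early_adopters.add(customer_id)

    def submit(self, payment: Payment) -> str:
        # Each customer lives entirely in one system, so no store is shared.
        if payment.customer_id in self.early_adopters:
            return self.modern.submit(payment)
        return self.legacy.submit(payment)


if __name__ == "__main__":
    router = Router(LegacyBatchSystem(), RealtimeSystem())
    router.opt_in("acme")
    print(router.submit(Payment("acme", 1_000)))
    print(router.submit(Payment("oldbank", 2_500)))
```

The point of routing per customer rather than per transaction type is that load on the new system grows only as fast as customers choose to move, and the old system is retired once its side of the router is empty.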
That gets away from the "Flip a switch on billions of dollars of transactions a day" terror.