Placid ursid ignoring the condition that pushes him toward a vegetarian diet of bamboo. Warning: this feed may contain traces of snark.

Paul Ford: What is Code?



A computer is a clock with benefits. They all work the same, doing second-grade math, one step at a time: Tick, take a number and put it in box one. Tick, take another number, put it in box two. Tick, operate (an operation might be addition or subtraction) on those two numbers and put the resulting number in box one. Tick, check if the result is zero, and if it is, go to some other box and follow a new set of instructions.
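The tick-by-tick description above can be sketched as a toy machine. This is only an illustration under stated assumptions: the instruction names (`put`, `add`, `sub`, `jump_if_zero`, `jump`) and the two-box layout are invented here and do not correspond to any real processor.

```python
def run(program, max_ticks=1000):
    """Run a toy two-box machine; each loop iteration is one 'tick'."""
    boxes = {1: 0, 2: 0}
    pc = 0      # position of the next instruction (hypothetical layout)
    ticks = 0
    while pc < len(program) and ticks < max_ticks:
        op, *args = program[pc]
        ticks += 1
        pc += 1
        if op == "put":               # take a number, put it in a box
            box, value = args
            boxes[box] = value
        elif op == "add":             # operate, result goes in box one
            boxes[1] = boxes[1] + boxes[2]
        elif op == "sub":
            boxes[1] = boxes[1] - boxes[2]
        elif op == "jump_if_zero":    # if box one is zero, go elsewhere
            (target,) = args
            if boxes[1] == 0:
                pc = target
        elif op == "jump":
            (target,) = args
            pc = target
        else:
            raise ValueError(f"unknown operation: {op}")
    return boxes, ticks


# Count down from 3 to 0 by subtracting 1 over and over.
countdown = [
    ("put", 1, 3),        # 0: box one = 3
    ("put", 2, 1),        # 1: box two = 1
    ("sub",),             # 2: box one = box one - box two
    ("jump_if_zero", 5),  # 3: done when box one hits zero
    ("jump", 2),          # 4: otherwise subtract again
]
boxes, ticks = run(countdown)
```

Nothing here is smarter than second-grade math; the only trick is that the machine never gets bored of repeating it.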

You, using a pen and paper, can do anything a computer can; you just can’t do those things billions of times per second. And those billions of tiny operations add up. They can cause a phone to boop, elevate an elevator, or redirect a missile. That raw speed makes it possible to pull off not one but multiple sleights of hand, card tricks on top of card tricks. Take a bunch of pulses of light reflected from an optical disc, apply some math to unsqueeze them, and copy the resulting pile of expanded impulses into some memory cells—then read from those cells to paint light on the screen. Millions of pulses, 60 times a second. That’s how you make the rubes believe they’re watching a movie.

Apple has always made computers; Microsoft used to make only software (and occasional accessory hardware, such as mice and keyboards), but now it’s in the hardware business, with Xbox game consoles, Surface tablets, and Lumia phones. Facebook assembles its own computers for its massive data centers.

So many things are computers, or will be. That includes watches, cameras, air conditioners, cash registers, toilets, toys, airplanes, and movie projectors. Samsung makes computers that look like TVs, and Tesla makes computers with wheels and engines. Some things that aren’t yet computers—dental floss, flashlights—will fall eventually.

When you “batch” process a thousand images in Photoshop or sum numbers in Excel, you’re programming, at least a little. When you use computers too much—which is to say a typical amount—they start to change you. I’ve had Photoshop dreams, Visio dreams, spreadsheet dreams, and Web browser dreams. The dreamscape becomes fluid and can be sorted and restructured. I’ve had programming dreams where I move text around the screen.

You can make computers do wonderful things, but you need to understand their limits. They’re not all-powerful, not conscious in the least. They’re fast, but some parts—the processor, the RAM—are faster than others—like the hard drive or the network connection. Making them seem infinite takes a great deal of work from a lot of programmers and a lot of marketers.

The turn-of-last-century British artist William Morris once said you can’t have art without resistance in the materials. The computer and its multifarious peripherals are the materials. The code is the art.

2.1 How Do You Type an “A”?

Consider what happens when you strike a key on your keyboard. Say a lowercase “a.” The keyboard is waiting for you to press a key, or release one; it’s constantly scanning to see what keys are pressed down. Hitting the key sends a scancode.

Just as the keyboard is waiting for a key to be pressed, the computer is waiting for a signal from the keyboard. When one comes down the pike, the computer interprets it and passes it farther into its own interior. “Here’s what the keyboard just received—do with this what you will.”

It’s simple now, right? The computer just goes to some table, figures out that the signal corresponds to the letter “a,” and puts it on screen. Of course not—too easy. Computers are machines. They don’t know what a screen or an “a” are. To put the “a” on the screen, your computer has to pull the image of the “a” out of its memory as part of a font, an “a” made up of lines and circles. It has to take these lines and circles and render them in a little box of pixels in the part of its memory that manages the screen. So far we have at least three representations of one letter: the signal from the keyboard; the version in memory; and the lines-and-circles version sketched on the screen. We haven’t even considered how to store it, or what happens to the letters to the left and the right when you insert an “a” in the middle of a sentence. Or what “lines and circles” mean when reduced to binary data. There are surprisingly many ways to represent a simple “a.” It’s amazing any of it works at all.

Coders are people who are willing to work backward to that key press. It takes a certain temperament to page through standards documents, manuals, and documentation and read things like “data fields are transmitted least significant bit first” in the interest of understanding why, when you expected “ü,” you keep getting “�.”
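A few lines of Python make the several faces of one letter visible, and show one plausible way to end up with "�" where "ü" was expected. The mismatch chosen here (bytes written as Latin-1, then read as UTF-8) is an assumption for illustration; real-world causes vary.

```python
letter = "a"
code_point = ord(letter)               # the number the letter is stored as
bits = format(code_point, "08b")       # that number written out in binary
utf8_bytes = letter.encode("utf-8")    # the bytes saved to disk

u_umlaut = "ü"
as_latin1 = u_umlaut.encode("latin-1")  # one byte
as_utf8 = u_umlaut.encode("utf-8")      # two bytes

# Read Latin-1 bytes as if they were UTF-8: the decoder gives up on the
# byte and substitutes U+FFFD, the replacement character.
garbled = as_latin1.decode("utf-8", errors="replace")
```

Same letter, different boxes. The bug is never in the "ü" itself, only in two programs disagreeing about which numbers mean "ü."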

2.2 From Hardware to Software

Hardware is a tricky business. For decades the work of integrating, building, and shipping computers was a way to build fortunes. But margins tightened. Look at Dell, now back in private hands, or Gateway, acquired by Acer. Dell and Gateway, two world-beating companies, stayed out of software, typically building PCs that came preinstalled with Microsoft Windows—plus various subscription-based services to increase profits.

This led to much cursing from individuals who’d spent $1,000 or more on a computer and now had to figure out how to stop the antivirus software from nagging them to pay up.

Ballmer chants “Developers!” Source: YouTube

Years ago, when Microsoft was king, Steve Ballmer, sweating through his blue button-down, jumped up and down in front of a stadium full of people and chanted, “Developers! Developers! Developers! Developers!”

He yelled until he was hoarse: “I love this company!” Of course he did. If you can sell the software, if you can light up the screen, you’re selling infinitely reproducible nothings. The margins on nothing are great—until other people start selling even cheaper nothings or giving them away. Which is what happened, as free software-based systems such as Linux began to nibble, then devour, the server market, and free-to-use Web-based applications such as Google Apps began to serve as viable replacements for desktop software.

Expectations around software have changed over time. IBM unbundled software from hardware in the 1960s and got to charge more; Microsoft rebundled Internet Explorer with Windows in 1998 and got sued; Apple initially refused anyone else the ability to write software for the iPhone when it came out in 2007, and then opened the App Store, which expanded into a vast commercial territory—and soon the world had Angry Birds. Today, much hardware comes with some software—a PC comes with an operating system, for example, and that OS includes hundreds of subprograms, from mail apps to solitaire. Then you download or buy more.

There have been countless attempts to make software easier to write, promising that you could code in plain English, or manipulate a set of icons, or make a list of rules—software development so simple that a bright senior executive or an average child could do it. Decades of efforts have gone into helping civilians write code as they might use a calculator or write an e-mail. Nothing yet has done away with developers, developers, developers, developers.

Thus a craft, and a professional class that lives that craft, emerged. Beginning in the 1950s, but catching fire in the 1980s, a proportionally small number of people became adept at inventing ways to satisfy basic human desires (know the time, schedule a flight, send a letter, kill a zombie) by controlling the machine. Coders, starting with concepts such as “signals from a keyboard” and “numbers in memory,” created infinitely reproducible units of digital execution that we call software, hoping to meet the needs of the marketplace. Man, did they. The systems they built are used to manage the global economic infrastructure. If coders don’t run the world, they run the things that run the world.

Most programmers aren’t working on building a widely recognized application like Microsoft Word. Software is everywhere. It’s gone from a craft of fragile, built-from-scratch custom projects to an industry of standardized parts, where coders absorb and improve upon the labors of their forebears (even if those forebears are one cubicle over). Software is there when you switch channels and your cable box shows you what else is on. You get money from an ATM—software. An elevator takes you up five stories—the same. Facebook releases software every day to something like a billion people, and that software runs inside Web browsers and mobile applications. Facebook looks like it’s just pictures of your mom’s crocuses or your son’s school play—but no, it’s software.

Photographer: Boru O’Brien O’Connell for Bloomberg Businessweek; Set design: Dave Bryant

2.3 How Does Code Become Software?

We know that a computer is a clock with benefits, and that software starts as code, but how?

We know that someone, somehow, enters a program into the computer and the program is made of code. In the old days, that meant putting holes in punch cards. Then you’d put the cards into a box and give them to an operator who would load them, and the computer would flip through the cards, identify where the holes were, and update parts of its memory, and then it would—OK, that’s a little too far back. Let’s talk about modern typing-into-a-keyboard code. It might look like this:

ispal: {x~|x}

That’s in a language called, simply, K, famous for its brevity. That code will test if something is a palindrome. If you next typed in ispal "able was i ere i saw elba", K will confirm that yes, this is a palindrome.
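For comparison, here is roughly the same test in Python. It is a sketch of the idea (a string matches its own reverse), not a translation of how K evaluates the expression.

```python
def ispal(text):
    """Return True if text reads the same forward and backward."""
    return text == text[::-1]


result = ispal("able was i ere i saw elba")
```

More characters than the K version, and arguably easier to read out loud.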

So how else might your code look? Maybe like so, in Excel (with all the formulas hidden away under the numbers they produce, and a check box that you can check):


But Excel spreadsheets are tricky, because they can hide all kinds of things under their numbers. This opacity causes risks. One study by a researcher at the University of Hawaii found that 88 percent of spreadsheets contain errors.

Programming can also look like Scratch, a language for kids:


That’s definitely programming right there—the computer is waiting for a click, for some input, just as it waits for you to type an “a,” and then it’s doing something repetitive, and it involves hilarious animals.

Or maybe:


That’s in Fortran. The reason it’s not working is that you forgot to put a quotation mark at the end of the first line. Try a little harder, thanks.

All of these things are coding of one kind or another, but the last bit is what most programmers would readily identify as code. A sequence of symbols (using typical keyboard characters, saved to a file of some kind) that someone typed in, or copied, or pasted from elsewhere. That doesn’t mean the other kinds of coding aren’t valid or won’t help you achieve your goals. Coding is a broad human activity, like sport, or writing. When software developers think of coding, most of them are thinking about lines of code in files. They’re handed a problem, think about the problem, write code that will solve the problem, and then expect the computer to turn word into deed.

Code is inert. How do you make it ert? You run software that transforms it into machine language. The word “language” is a little ambitious here, given that you can make a computing device with wood and marbles. Your goal is to turn your code into an explicit list of instructions that can be carried out by interconnected logic gates, thus turning your code into something that can be executed—software.

A compiler is software that takes the symbols you typed into a file and transforms them into lower-level instructions. Imagine a programming language called Business Operating Language United System, or Bolus. It’s a terrible language that will have to suffice for a few awkward paragraphs. It has one real command, PRINT. We want it to print HELLO NERDS on our screen. To that end, we write a line of code in a text file that says:

PRINT {HELLO NERDS}


And we save that as nerds.bol. Now we run gnubolus nerds.bol, our imaginary compiler program. How does it start? The only way it can: by doing lexical analysis, going character by character, starting with the “p,” grouping characters into tokens, saving them into our one-dimensional tree boxes. Let’s be the computer.

Character     Meaning
P             Hmmmm...?
R             Someone say something?
I             I’m waiting...
N             [drums fingers]
T             Any time now...
Space         Ah, "PRINT"
{             String coming!
H             These
E             letters
L             don’t
L             matter
O             la
Space         la
N             just
E             saving
R             them
D             for
S             later
}             Stringtime is over!
End of file   Time to get to work.
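The table above can be acted out in code. What follows is a minimal sketch of a character-by-character tokenizer for the imaginary Bolus language; the token names `COMMAND` and `STRING` are made up for this example, and a real lexer would track far more (line numbers, escapes, error recovery).

```python
def tokenize(source):
    """Group characters into (kind, text) tokens, one character at a time."""
    tokens = []
    word = ""       # letters collected so far for a command word
    string = None   # None outside braces; a str while inside them
    for ch in source:
        if string is not None:
            if ch == "}":                 # stringtime is over
                tokens.append(("STRING", string))
                string = None
            else:                         # just saving letters for later
                string += ch
        elif ch == "{" or ch.isspace():
            if word:                      # a word just ended
                tokens.append(("COMMAND", word))
                word = ""
            if ch == "{":                 # string coming
                string = ""
        else:
            word += ch
    if string is not None:
        raise SyntaxError("string was never closed with '}'")
    if word:
        tokens.append(("COMMAND", word))
    return tokens


tokens = tokenize("PRINT {HELLO NERDS}")
```

Drop the closing brace and the function refuses to continue, which is about as much sympathy as a compiler ever shows.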

The reason I’m showing it to you is so you can see how every character matters. Computers usually “understand” things by going character by character, bit by bit, transforming the code into other kinds of code as they go. The Bolus compiler now organizes the tokens into a little tree. Kind of like a sentence diagram. Except instead of nouns, verbs, and adjectives, the computer is looking for functions and arguments. Our program above, inside the computer, becomes this:

Trees are a really pleasant way of thinking of the world. Your memo at work has sections that have paragraphs? Tree. Your e-mail program contains messages that contain subject lines and addresses? Tree. Your favorite software program that has a menu bar with individual items that have subitems? Tree. Every day is Arbor Day in Codeville.
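One way to hold such a tree in a program is a node with a label and a list of children. This is a sketch only: the `Node` class and the `program` root label are assumptions for illustration, not the structure any particular compiler uses.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node: a label plus zero or more child nodes."""
    label: str
    children: list = field(default_factory=list)


def render(node, depth=0):
    """Return the tree as indented lines, one line per node."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# The one-line Bolus program: a function (PRINT) and its argument.
program = Node("program", [Node("PRINT", [Node("HELLO NERDS")])])
outline = render(program)
```

The same two pieces would hold a memo with sections and paragraphs, or a menu bar with items and subitems.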

Of course, it’s all a trick. If you cut open a computer, you’ll find countless little boxes in rows, places where you can put and retrieve bytes. Everything ultimately has to get down to things in little boxes pointing to each other. That’s just how things work. So that tree is actually more like this:

Every character truly, truly matters. Every single stupid misplaced semicolon, space where you meant tab, bracket instead of a parenthesis—mistakes can leave the computer in a state of panic. The trees don’t know where to put their leaves. Their roots decay. The boxes don’t stack neatly. For not only are computers as dumb as a billion marbles, they’re also positively Stradivarian in their delicacy.

That process of going character by character can be wrapped up into a routine—also called a function, a method, a subroutine, or component. (Little in computing has a single, reliable name, which means everyone is always arguing over semantics.) And that routine can be run as often as you need. Second, you can print anything you wish, not just one phrase. Third, you can repeat the process forever, and nothing will stop you until the machine breaks or, barring that, heat death of the universe. Obviously no one besides Jack Nicholson in The Shining really needs to keep typing the same phrase over and over, and even then it turned out to be a bad idea.

Instead of worrying about where the words are stored in memory and having to go character by character, programming languages let you think of things like strings, arrays, and trees. That’s what programming gives you. You may look over a programmer’s shoulder and think the code looks complex and boring, but it’s covering up repetitive boredom that’s unimaginably vast.

This thing we just did with individual characters, compiling a program down into a fake assembly language so that the nonexistent computer can print each character one at a time? The same principle applies to every pixel on your screen, every frequency encoded in your MP3 files, and every imaginary cube in Minecraft. Computing treats human language as an arbitrary set of symbols in sequences. It treats music, imagery, and film that way, too.

It’s a good and healthy exercise to ponder what your computer is doing right now. Maybe you’re reading this on a laptop: What are the steps and layers between what you’re doing and the Lilliputian mechanisms within? When you double-click an icon to open a program such as a word processor, the computer must know where that program is on the disk. It has some sort of accounting process to do that. And then it loads that program into its memory—which means that it loads an enormous to-do list into its memory and starts to step through it. What does that list look like?

Maybe you’re reading this in print. No shame in that. In fact, thank you. The paper is the artifact of digital processes. Remember how we put that “a” on screen? See if you can get from some sleepy writer typing that letter on a keyboard in Brooklyn, N.Y., to the paper under your thumb. What framed that fearful symmetry?

Thinking this way will teach you two things about computers: One, there’s no magic, no matter how much it looks like there is. There’s just work to make things look like magic. And two, it’s crazy in there.


Photographer: Asger Carlsen for Bloomberg Businessweek; Set Design: Dave Bryant

2.4 What Is an Algorithm?

“Algorithm” is a word writers invoke to sound smart about technology. Journalists tend to talk about “Facebook’s algorithm” or a “Google algorithm,” which is usually inaccurate. They mean “software.”

Algorithms don’t require computers any more than geometry does. An algorithm solves a problem, and a great algorithm gets a name. Dijkstra’s algorithm, after the famed computer scientist Edsger Dijkstra, finds the shortest path in a graph. By the way, “graph” here doesn’t mean a chart but rather a set of points connected by lines.

Think of a map; streets connect to streets at intersections. It’s a graph! There are graphs all around you. Plumbing, electricity, code compilation, social networks, the Internet, all can be represented as graphs! (Now to monetize …)
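To make the map idea concrete, here is a compact sketch of Dijkstra's algorithm using Python's standard `heapq` module. The street names and distances are invented for the example, and the approach assumes no distance is negative.

```python
import heapq


def shortest_path(graph, start, goal):
    """Return (distance, path), or (inf, []) if goal can't be reached."""
    queue = [(0, start, [start])]   # (distance so far, place, route taken)
    visited = set()
    while queue:
        distance, node, path = heapq.heappop(queue)  # closest place first
        if node == goal:
            return distance, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(
                    queue, (distance + weight, neighbor, path + [neighbor])
                )
    return float("inf"), []


# A hypothetical neighborhood: intersections and the blocks between them.
streets = {
    "Home": {"Bakery": 4, "Park": 1},
    "Park": {"Home": 1, "Bakery": 2, "Office": 6},
    "Bakery": {"Home": 4, "Park": 2, "Office": 3},
    "Office": {"Park": 6, "Bakery": 3},
}
distance, route = shortest_path(streets, "Home", "Office")
```

The direct-looking routes lose: cutting through the park and past the bakery is shorter than either obvious option.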

Many algorithms have their own pages on Wikipedia. You can spend days poking around them in wonder. Euclid’s algorithm, for example, is the go-to specimen that shows up whenever anyone wants to wax on about algorithms, so why buck the trend? It’s a simple way of determining the greatest common divisor for two numbers. Take two numbers, like 16 and 12. Divide the first by the second. If there’s a remainder (in this case there is, 4), divide the smaller number, 12, by that remainder, 4, which gives you 3 and no remainder, so we’re done—and 4 is the greatest common divisor. (Now translate that into machine code, and we can get out of here.)
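Short of machine code, the division steps can at least be traced in Python. This sketch records each division so the arithmetic for 16 and 12 can be checked by hand; the function name `gcd_steps` is made up for the example.

```python
def gcd_steps(a, b):
    """Return the gcd plus each (dividend, divisor, remainder) step."""
    steps = []
    while b != 0:
        remainder = a % b
        steps.append((a, b, remainder))
        a, b = b, remainder   # the divisor becomes the next dividend
    return abs(a), steps


divisor, steps = gcd_steps(16, 12)
```

Two divisions: 16 by 12 leaves 4, then 12 by 4 leaves nothing, so the last divisor, 4, is the answer.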

There’s a site called Rosetta Code that shows you different algorithms in different languages. The Euclid’s algorithm page is great. Some of the examples are suspiciously long and laborious, and some are tiny nonsense poetry, like this one, in the language Forth:

: gcd ( a b -- n )
  begin dup while tuck mod repeat drop ;

Read it out loud, preferably to friends. Forth is based on the concept of a stack, which is a special data structure. You make “words” that do things on the stack, building up a little language of your own. PostScript, the language of laser printers, came after Forth but is much like it. Look at how similar the code is, give or take some squiggles:

/gcd {
  {
    {0 gt} {dup rup mod} {pop exit} ifte
  } loop
} def

And that’s Euclid’s algorithm in PostScript. I admit, this might be fun only for me. Here it is in Python (all credit to Rosetta Code):

def gcd(u, v):
    return gcd(v, u % v) if v else abs(u)

A programming language is a system for encoding, naming, and organizing algorithms for reuse and application. It’s an algorithm management system. This is why, despite the hype, it’s silly to say Facebook has an algorithm. An algorithm can be translated into a function, and that function can be called (run) when software is executed. There are algorithms that relate to image processing and for storing data efficiently and for rapidly running through the elements of a list. Most algorithms come for free, already built into a programming language, or are available, organized into libraries, for download from the Internet in a moment. You can do a ton of programming without actually thinking about algorithms—you can save something into a database or print a Web page by cutting and pasting code. But if you want the computer to, say, identify whether it’s reading Spanish or Italian, you’ll need to write a language-matching function. So in that sense, algorithms can be pure, mathematical entities as well as practical expressions of ideas on which you can place your grubby hands.
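A language-matching function can be sketched very crudely by counting telltale words. The word lists below are a small hand-picked sample chosen for this example, not a real linguistic resource, and a practical detector would rely on much richer statistics.

```python
# Hypothetical hint lists: common words that appear in one language
# but not (in this spelling) in the other.
SPANISH_HINTS = {"el", "los", "las", "y", "es", "que", "por", "pero", "muy"}
ITALIAN_HINTS = {"il", "gli", "che", "è", "di", "sono", "per", "ma", "molto"}


def guess_language(text):
    """Guess 'Spanish', 'Italian', or 'unknown' from telltale words."""
    words = [word.strip(".,;:!?¿¡") for word in text.lower().split()]
    spanish = sum(word in SPANISH_HINTS for word in words)
    italian = sum(word in ITALIAN_HINTS for word in words)
    if spanish > italian:
        return "Spanish"
    if italian > spanish:
        return "Italian"
    return "unknown"


first = guess_language("El gato es muy grande y negro.")
second = guess_language("Il gatto è molto grande ma pigro.")
```

It is an algorithm in the grubby-hands sense: easy to write, easy to fool, and good enough to show where the real work would start.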

Dijkstra distributed a remarkable and challenging set of 1,318 memos, known as “EWDs,” to the global computer science community, starting in the 1960s and continuing up until his death in 2002. Many of them were handwritten.

One thing that took me forever to understand is that computers aren’t actually “good at math.” They can be programmed to execute certain operations to certain degrees of precision, so much so that it looks like “doing math” to humans. Dijkstra said: “Computer science is no more about computers than astronomy is about telescopes.” A huge part of computer science is about understanding the efficiency of algorithms—how long they will take to run. Computers are fast, but they can get bogged down—for example, when trying to find the shortest path between two points on a large map. Companies such as Google, Facebook, and Twitter are built on top of fundamental computer science and pay great attention to efficiency, because their users do things (searches, status updates, tweets) an extraordinary number of times. Thus it’s absolutely worth their time to find excellent computer scientists, many with doctorates, who know where all the efficiencies are buried.
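A small experiment shows why efficiency gets so much attention. The sketch below counts how many items each of two search strategies has to look at in a sorted list of a million numbers; the step-counting helpers are written for this example only.

```python
def linear_search_steps(items, target):
    """Check items one by one; return how many were examined."""
    for steps, item in enumerate(items, start=1):
        if item == target:
            return steps
    return len(items)


def binary_search_steps(sorted_items, target):
    """Halve the search range each time; return how many were examined."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            return steps
        if sorted_items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return steps


numbers = list(range(1_000_000))
slow = linear_search_steps(numbers, 999_999)
fast = binary_search_steps(numbers, 999_999)
```

Looking for the last number takes a million checks one way and no more than twenty the other. Multiply that gap by every search anyone runs in a day and the doctorates start to pay for themselves.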

It takes a good mathematician to be a computer scientist, but a middling one to be an effective programmer. Until you start dealing with millions of people on a network or you need to blur or sharpen a million photos quickly, you can just use the work of other people. When it gets real, break out the comp sci. When you’re doing anything a hundred trillion times, nanosecond delays add up. Systems slow down, users get cranky, money burns by the barrel.

The hardest work in programming is getting around things that aren’t computable, in finding ways to break impossible tasks into small, possible components, and then creating the impression that the computer is doing something it actually isn’t, like having a human conversation. This used to be known as “artificial intelligence research,” but now it’s more likely to go under the name “machine learning” or “data mining.” When you speak to Siri or Cortana and they respond, it’s not because these services understand you; they convert your words into text, break that text into symbols, then match those symbols against the symbols in their database of terms, and produce an answer. Tons of algorithms, bundled up and applied, mean that computers can fake listening.

A programming language has at least two jobs, then. It needs to wrap up lots of algorithms so they can be reused. Then you don’t need to go looking for a square-root algorithm (or a genius programmer) every time you need a square root. And it has to make it easy for programmers to wrap up new algorithms and routines into functions for reuse. The DRY principle, for Don’t Repeat Yourself, is one of the colloquial tenets of programming. That is, you should name things once, do things once, create a function once, and let the computer repeat itself. This doesn’t always work. Programmers repeat themselves constantly. I’ve written certain kinds of code a hundred times. This is why DRY is a principle.
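Both jobs fit in a few lines. In this sketch the language supplies the square root (`math.sqrt` from Python's standard library), and a new routine, `distance`, wraps it up for reuse; the route and its coordinates are invented for the example.

```python
import math


def distance(x1, y1, x2, y2):
    """Straight-line distance between two points."""
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


# Define the routine once, then let the computer repeat itself.
route = [(0, 0), (3, 4), (3, 10)]
total = sum(distance(*a, *b) for a, b in zip(route, route[1:]))
```

Nobody went looking for a square-root algorithm, and the distance formula appears exactly once, no matter how many legs the route has.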

Enough talk. Let’s code!

2.5 The Sprint

After a few months the budget is freed up, and the Web re-architecture project is under way. They give it a name: Project Excelsior. Fine. TMitTB (who, to be fair, has other clothes and often dresses like he’s in Weezer) checks in with you every week.

He brings documents. Every document has its own name. The functional specification is a set of at least a thousand statements about users clicking buttons. “Upon accessing the Web page the user if logged in will be identified by name and welcomed and if not logged in will be encouraged to log in or create an account. (See user registration workflow.)”

God have mercy on our souls. From there it lists various error messages. It’s a sort of blueprint in that it describes—in words, with occasional diagrams—a program that doesn’t exist.

Some parts of the functional specification refer to “user stories,” tiny hypothetical narratives about people using the site, e.g., “As a visitor to the website, I want to search for products so I can quickly purchase what I want.”

Then there’s something TMitTB calls wireframe mock-ups, which are pictures of how the website will look, created in a program that makes everything seem as if it were sketched by hand, all a little squiggly—even though it was produced on a computer. This is so no one gets the wrong idea about these ideas-in-progress and takes them too seriously. Patronizing, but point taken.

You rarely see TMitTB in person, because he’s often at conferences where he presents on panels. He then tweets about the panels and notes them on his well-populated LinkedIn page. Often he takes a picture of the audience from the stage, and what you see is an assembly of mostly men, many with beards, the majority of whom seem to be peering into their laptop instead of up at the stage. Nonetheless the tweet that accompanies that photo says something like, “AMAZING audience! @ the panel on #microservice architecture at #ArchiCon2015.”

He often tells you just how important this panel-speaking is for purposes of recruiting. Who’s to say he is wrong? It costs as much to hire a senior programmer as it does to hire a midlevel executive, so maybe going to conferences is his job, and in the two months he’s been here he’s hired four people. His two most recent hires have been in Boston and Hungary, neither of which is a place where you have an office.

But what does it matter? Every day he does a 15-minute “standup” meeting via something called Slack, which is essentially like Google Chat but with some sort of plaid visual theme, and the programmers seem to agree that this is a wonderful and fruitful way to work.

“I watch the commits,” TMitTB says. Meaning that every day he reviews the code that his team writes to make sure that it’s well-organized. “No one is pushing to production without the tests passing. We’re good.”

Your meetings, by comparison, go for hours, with people arranged around a table—sitting down. You wonder how he gets his programmers to stand up, but then some of them already use standing desks. Perhaps that’s the ticket.

Honestly, you would like to go to conferences sometimes and be on panels. You could drink bottled water and hold forth just fine.

2.6 What’s With All These Conferences, Anyway?

Conferences! The website Lanyrd lists hundreds of technology conferences for June 2015. There’s an event for software testers in Chicago, a Twitter conference in São Paulo, and one on enterprise content management in Amsterdam. In New York alone there’s the Big Apple Scrum Day, the Razorfish Tech Summit, an entrepreneurship boot camp for veterans, a conference dedicated to digital mapping, many conferences for digital marketers, one dedicated to Node.js, one for Ruby, and one for Scala (these are programming languages), a couple of breakfasts, a conference for cascading style sheets, one for text analytics, and something called the Employee Engagement Awards.

Tech conferences look like you’d expect. Tons of people at a Sheraton, keynote in Ballroom D. Or enormous streams of people wandering through South by Southwest in Austin. People come together in the dozens or thousands and attend panels, ostensibly to learn; they attend presentations and brush up their skills, but there’s a secondary conference function, one of acculturation. You go to a technology conference to affirm your tribal identity, to transfer out of the throng of dilettantes and into the zone of the professional. You pick up swag and talk to vendors, if that’s your thing.

First row: TechCrunch Disrupt NYC, May 2011; Google I/O developers conference, San Francisco, May 2013; Global Mobile Internet Conference, Beijing, April 2015
Second row: Nvidia GPU, San Jose, September 2010; South by Southwest (SXSW) Interactive Festival, Austin, March 2013; Apple Worldwide Developers Conference (WWDC), San Francisco, June 2008
Third row: TechCrunch Disrupt NYC, May 2012; Re:publica conference, Berlin, May 2015; TechCrunch Disrupt NYC, May 2015
Fourth row: SXSW Interactive Festival, Austin, March 2014; WWDC, San Francisco, June 2015; Bloomberg Technology Conference!, San Francisco, June 15-16

Technology conferences are where primate dynamics can be fully displayed, where relationships of power and hierarchy can be established. There are keynote speakers—often the people who created the technology at hand or crafted a given language. There are the regular speakers, often paid not at all or in airfare, who present some idea or technique or approach. Then there are the panels, where a group of people are lined up in a row and forced into some semblance of interaction while the audience checks its e-mail.

I’m a little down on panels. They tend to drift. I’m not sure why they exist.

Here’s the other thing about technology conferences: There has been much sexual harassment and much sexist content in conferences. Which is stupid, because computers are dumb rocks lacking genitalia, but there you have it.

Women in software, having had enough, started to write it up, post to blogs. Other women did the same. The problem is pervasive: There are a lot of conferences, and there have been many reports of harassing behavior. The language Ruby, the preferred language for startup bros, developed the worst reputation. At a Ruby conference in 2009, someone gave a talk subtitled “Perform Like a Pr0n Star,” with sexy slides. That was dispiriting. There have been criminal incidents, too.

Conferences began to develop codes of conduct, rules and algorithms for people (men, really) to follow.

If you are subject to or witness unacceptable behavior, or have any other concerns, please notify a community organizer as soon as possible … 

Burlington Ruby Conference

php[architect] is dedicated to providing a harassment-free event experience for everyone and will not tolerate harassment or offensive behavior in any form.

The Atlanta Java Users Group (AJUG) is dedicated to providing an outstanding conference experience for all attendees, speakers, sponsors, volunteers, and organizers involved in DevNexus (GeekyNerds) regardless of gender, sexual orientation, disability, physical appearance, body size, race, religion, financial status, hair color (or hair amount), platform preference, or text editor of choice.

When people started talking about conference behavior, they also began to talk about the larger problems of programming culture. This was always an issue, but the conference issues gave people a point of common reference. Why were there so many men in this field? Why do they behave so strangely? Why is it so hard for them to be in groups with female programmers and behave in a typical, adult way?

“I go to work and I stick out like a sore thumb. I have been mistaken for an administrative assistant more than once. I have been asked if I was physical security (despite security wearing very distinctive uniforms),” wrote a black woman who has worked, among other places, at Google.

Famous women in coding history

Ada Lovelace: The first programmer. She devised algorithms for Charles Babbage’s “analytical engine,” which he never built.


Grace Murray Hopper: World War II hero and inventor of the compiler.


“Always the only woman in the meeting, often the first—the first female R&D engineer, first female project lead, first female software team lead—in the companies I worked for,” wrote another woman in Fast Company magazine.

Fewer than a fifth of undergraduate degrees in computer science awarded in 2012 went to women, according to the National Center for Women & Information Technology. Less than 30 percent of the people in computing are women. And the number of women in computing has fallen since the 1980s, even as the market for their skills has expanded. The pipeline is a huge problem. And yet it’s not unsolvable. I’ve met managers who have built perfectly functional large teams that are more than half female coders. Places such as the handicrafts e-commerce site Etsy have made a particular effort to develop educational programs and mentorship programs. Organizations such as the not-for-profit Girl Develop It teach women, and just women, how to create software.

It’s all happening very late in the boom, though. In 2014 some companies began to release diversity reports for their programming teams. It wasn’t a popular practice, but it was revealing. Intel is 23 percent female; Yahoo! is 37 percent. Apple, Facebook, Google, Twitter, and Microsoft are all around 30 percent. These numbers are for the whole companies, not only programmers. That’s a lot of women who didn’t get stock options. The numbers of people who aren’t white or Asian are worse yet. Apple just gave $50 million to fund diversity initiatives, equivalent to 0.007 percent of its market cap. Intel has a $300 million diversity project.

The average programmer is moderately diligent, capable of basic mathematics, has a working knowledge of one or more programming languages, and can communicate what he or she is doing to management and his or her peers. Given that a significant number of women work as journalists and editors, perform surgery, run companies, manage small businesses, and use spreadsheets, that a few even serve on the Supreme Court, and that we are no longer surprised to find women working as accountants, professors, statisticians, or project managers, it’s hard to imagine that they can’t write JavaScript. Programming, despite the hype and the self-serving fantasies of programmers the world over, isn’t the most intellectually demanding task imaginable.

Which leads one to the inescapable conclusion: The problem with women in technology isn’t the women.

3289 days ago
A must read.

(Caution: it's such a long LONG read it’s truncated here)
3289 days ago
For anybody who's wondering, shared stories that come from the bookmarklet have a max length. This is also the first instance that I'm aware of where a bookmarklet share made it to popular.
3289 days ago
@samuel: I was curious about that – that was long enough to have some interesting highlighting issues when the bookmarklet loaded
3289 days ago
4 public comments
3289 days ago
Fantastic and worth every minute
3290 days ago
Brooklyn, NY
3290 days ago
Wow. I won't pretend I've read it all but so far so good.
3290 days ago
Yeah, it's a serious get-a-cup-of-coffee-and-maybe-lunch read
3291 days ago
Paul Ford putting the long in long-read…
Washington, DC

Operating Systems

3 Comments and 14 Shares
One of the survivors, poking around in the ruins with the point of a spear, uncovers a singed photo of Richard Stallman. They stare in silence. "This," one of them finally says, "This is a man who BELIEVED in something."
3357 days ago
3 public comments
3357 days ago
"dos, but ironically" isn't until 2030? hmph.
Earth, Sol system, Western spiral arm
3355 days ago
We did have ironical dos on Lumias just last week.
3355 days ago
I wonder where Plan 9 is. Oh, hmph, it's the OS running in the author's house.
3357 days ago
Heh. Love the alt text.
Atlanta, GA
3338 days ago
alt txt: "One of the survivors, poking around in the ruins with the point of a spear, uncovers a singed photo of Richard Stallman. They stare in silence. "This," one of them finally says, "This is a man who BELIEVED in something."

Lets review.. Docker (again) | Cal Leeming Blog

1 Comment


It's been just over a year since my last review of Docker, heavily criticising its flawed architectural design and poor user experience. The project has since matured into 1.0 and gained some notoriety from Amazon, but has suffered growing user frustration, hype accusations and even breakout exploits leading to host contamination. However the introduction of private repos in Docker Hub, which eliminated the need to run your own registry for hosted deployments, coupled with webhooks and tight Github build integrations, looked to be a promising start.

So I decided to give Docker another chance and put it into production for 6 months. The result was an absolute shit show of abysmal performance, hacky workarounds and rage-inducing user experience which left me wanting to smash my face into the desk. Indeed, performance was so bad that disabling caching features actually resulted in faster build times.

(See reddit and hackernews thread for discussion, credits to CYPHERDEN for the cover image, taken from PewDiePie)


Dockerfile has numerous problems: it's ugly, restrictive, contradictory and fundamentally flawed. Let's say you want to build multiple images of a single repo, for example a second image which contains debugging tools, but both using the same base requirements. Docker does not support this (per #9198): there is no ability to extend a Dockerfile (per #735), using sub directories will break build context and prevent you using ADD/COPY (per #2224), as would piping (per #2112), and you cannot use env vars at build time to conditionally change instructions (per #2637).

Our hacky workaround was to create a base image, two environment specific images and some Makefile automation which involved renaming and sed replacement. There are also some unexpected "features" which lead to env $HOME disappearing, resulting in unhelpful error messages. Absolutely disgusting.
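To make the shape of that workaround concrete, here is a rough sketch of the same idea in Python rather than Makefile and sed. It is not the author's actual automation: the base fragment, the package names and the `render_dockerfile` helper are all hypothetical, chosen only to show one shared base feeding two environment-specific Dockerfiles.

```python
# Hypothetical sketch: generate per-environment Dockerfiles from one shared
# base fragment, instead of renaming files and patching them with sed.

BASE = """FROM ubuntu:14.04
RUN apt-get update && apt-get install -y python
"""

# Extra instructions per variant; the package choices are illustrative only.
VARIANTS = {
    "production": "",
    "debug": "RUN apt-get install -y strace gdb\n",
}


def render_dockerfile(variant: str) -> str:
    """Return the full Dockerfile text for one named variant."""
    if variant not in VARIANTS:
        raise KeyError(f"unknown variant: {variant}")
    return BASE + VARIANTS[variant]


if __name__ == "__main__":
    for name in VARIANTS:
        # Writes Dockerfile.production and Dockerfile.debug next to the script.
        with open(f"Dockerfile.{name}", "w") as fh:
            fh.write(render_dockerfile(name))
```

A build script would still have to copy the chosen file into place as `Dockerfile` before building, which is exactly the kind of renaming step complained about above.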

Docker cache/layers

Docker has the ability to cache Dockerfile instructions by using COW (copy-on-write) filesystems, similar to that of LVM snapshots, and until recently only supported AuFS, which has numerous problems. Then in release 0.7 different COW implementations were introduced to improve stability and performance, which you can read about in detail here.

However this caching system is unintelligent, resulting in some surprising side effects with no ability to prevent a single instruction from caching (per #1996). It's also painfully slow, to the point that builds will be faster if you disable caching and avoid using layers. This is exacerbated by slow upload/download speeds in Docker Hub, detailed further down.

These problems are caused by the poor architectural design of Docker as a whole, enforcing linear instruction execution even in situations where it is entirely inappropriate (per #2439). As a workaround for slow builds, you can use a third party tool which supports asynchronous execution, such as Salt Stack, Puppet or even bash, completely defeating the purpose of layers and making them useless.

Docker Hub

Docker encourages social collaboration via Docker Hub which allows Dockerfiles to be published, both public and private, which can later be extended and used by other users via FROM instruction, rather than copy/pasting. This ecosystem is akin to AMIs in AWS marketplace and Vagrant boxes, which in principle are very useful.

However the Docker Hub implementation is flawed for several reasons. Dockerfile does not support multiple FROM instructions (per #3378, #5714 and #5726), meaning you can only inherit from a single image. It also has no version enforcement, for example the author of dockerfile/ubuntu:14.04 could replace the contents of that tag, which is the equivalent of using a package manager without enforcing versions. And as mentioned later in the article, it has frustratingly slow speed restrictions.

Docker Hub also has an automated build system which detects new commits in your repository and triggers a container build. It is also completely useless for many reasons. Build configuration is restrictive with little to no ability for customisation, missing even the basics of pre/post script hooks. It enforces a specific project structure, expecting a single Dockerfile in the project root, which breaks our previously mentioned build workarounds, and build times were horribly slow.

Our workaround was to use CircleCI, an exceptional hosted CI platform, which triggered Docker builds from Makefile and pushed up to Docker Hub. This did not solve the problem of slow speeds, but the only alternative was to use our own Docker Registry, which is ridiculously complex.


Docker originally used LXC as its default execution environment, but has used its own libcontainer by default since 0.9. This introduced the ability to tweak namespace capabilities and privileges, and to use customised LXC configs when using the appropriate exec-driver.

It requires a root daemon be running at all times on the host, and there have been numerous security vulnerabilities in Docker, for example CVE-2014-6407 and CVE-2014-6408 which, quite frankly, should not have existed in the first place. Even Gartner, with their track record for poor assessments, expressed concern over the immaturity of Docker and the security implications.

Docker, by design, puts ultimate trust in namespace capabilities which expose a much larger attack surface than a typical hypervisor, with Xen having 129 CVEs in comparison with the 1279 in Linux. This can be acceptable in some situations, for example public builds in Travis CI, but is dangerous in private, multi user environments.

Containers are not VMs

Namespaces and cgroups are beautifully powerful, allowing a process and its children to have a private view of shared kernel resources, such as the network stack and process table. This fine-grain control and segregation, coupled with chroot jailing and grsec, can provide an excellent layer of protection. Some applications, for example uWSGI, take direct advantage of these features without Docker, and applications which don't support namespaces directly can be sandboxed using firejail. If you're feeling adventurous, you can build namespace support directly into your code.
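As a small illustration of the "private view" idea, the sketch below only inspects namespaces and does not create any. It assumes a Linux host where `/proc/<pid>/ns` is readable; the helper name `current_namespaces` is made up for this example. Two processes are in the same namespace when the identifiers it returns match.

```python
import os


def current_namespaces(pid: str = "self") -> dict:
    """Map namespace type (e.g. 'net', 'pid', 'uts') to its kernel identifier.

    Returns an empty dict where /proc/<pid>/ns is unavailable (e.g. non-Linux).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result
    for name in names:
        try:
            # Each entry is a symlink whose target names the namespace instance.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return result


if __name__ == "__main__":
    for kind, ident in sorted(current_namespaces().items()):
        print(kind, ident)
```

Running it once on the host and once inside a sandboxed process should show different identifiers for whichever namespaces the sandbox unshared.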

Containerisation projects, such as LXC and Docker, take advantage of these features to effectively run multiple distros inside the same kernel space. In comparison with hypervisors, this can sometimes have the advantage of lower memory usage and faster startup times, but at the cost of reduced security, stability and compatibility. One horrible edge case relates to Linux Kernel Interfaces, running incompatible or untested combinations of glibc versions in kernel and userspace, resulting in unexpected behavior.

Back in 2008 when LXC was conceived, hardware assisted virtualisation had only been around for a couple of years and many hypervisors had performance and stability issues. As such, virtualisation was not a widely used technology, and these were acceptable tradeoffs to keep costs low and reduce physical footprint. However we have now reached the point where hypervisor performance is almost as fast as bare metal and, interestingly, faster in some cases. Hosted on-demand VMs are also becoming faster and cheaper, with DigitalOcean massively outperforming EC2 in both performance and cost, making it financially viable to have a 1:1 mapping of applications to VMs.

[edit] As pointed out by Bryan Cantrill, virtualisation performance will vary depending on workload types, for example IO heavy applications can result in slower performance.

There are some specific use cases in which containerisation is the correct approach, but unless you can explain precisely why in your use case, then you should probably be using a hypervisor instead. Even if you're using virtualisation you should still be taking advantage of namespaces, and tools such as firejail can help when your application lacks native support for these features.

Docker is unnecessary

Docker adds an intrusive layer of complexity which makes development, troubleshooting and debugging frustratingly difficult, often creating more problems than it solves. It doesn't have any benefits for deployment, because you still need to utilise snapshots to achieve responsive auto scaling. Even worse, if you're not using snapshots then your production environment scaling is dependent on the stability of Docker Hub.

It is already being abused by projects such as baseimage-docker, an image which intends to make inspection, debugging and compatibility easier by running init.d as its entry point and even giving you an optional SSH server, effectively treating the container like a VM, although the authors reject this notion with a poor argument.


If your development workflow is sane, then you will already understand that Docker is unnecessary. All of the features which it claims to be helpful are either useless or poorly implemented, and its primary benefits can be easily achieved using namespaces directly. Docker would have been a cute idea 8 years ago, but it's pretty much useless today.


On the surface, Docker has a lot going for it. Its ecosystem is encouraging developers towards a mindset of immutable deployments, and starting new projects can be done quickly and easily, something which many people find useful. However it's important to note that this article focuses on the daily, long term usage of Docker, both locally and in production.

Although most of the problems mentioned are self explanatory, this post makes no effort to explain how Docker could do it better. There are many alternative solutions to Docker, each with their own pros/cons, and I'll be explaining these in detail on a follow up post. If you expect anything positive from Docker, or its maintainers, then you're shit outta luck.

There is a sub discussion by a-ko discussing long term impacts of containerisation, and a detailed technical rebuttal by markbnj, both of which you may find quite useful.

I'd like to say thank you to everyone who took time to give their feedback. It's fantastic to see so many people enjoy my style of writing, and reading responses from several high profile engineers, including those who have inspired me for many years, has been very humbling.

Response from Docker founder Solomon Hykes

And not a single fuck was given that day... pastebin link here.

3371 days ago
I'm still confused and dubious about Docker. Guess I'm not the only one despite the hype surrounding the project.

The sad state of sysadmin in the age of containers

1 Comment

System administration is in a sad state. It is a mess.

I'm not complaining about old-school sysadmins. They know how to keep systems running and how to manage update and upgrade paths.

This rant is about containers, prebuilt VMs, and the incredible mess they cause because their concept lacks notions of "trust" and "upgrades".

Consider for example Hadoop. Nobody seems to know how to build Hadoop from scratch. It's an incredible mess of dependencies, version requirements and build tools.

None of these "fancy" tools still builds with a traditional make command. Every tool has to come up with its own, incompatible, and non-portable "method of the day" of building.

And since nobody is still able to compile things from scratch, everybody just downloads precompiled binaries from random websites. Often without any authentication or signature.

NSA and virus heaven. You don't need to exploit any security hole anymore. Just make an "app" or "VM" or "Docker" image, and have people load your malicious binary to their network.


The Hadoop wiki page of Debian is a typical example. Essentially, people gave up in 2010 on being able to build Hadoop from source for Debian and offer nice packages.

To build Apache Bigtop, you apparently first have to install puppet3. Let it download magic data from the internet. Then it tries to run sudo puppet to enable the NSA backdoors (for example, it will download and install an outdated precompiled JDK, because it considers you too stupid to install Java.) And then hope the gradle build doesn't throw a 200 line useless backtrace.

I am not joking. It will try to execute commands such as:

/bin/bash -c "wget … ; dpkg -x ./scala-2.10.3.deb /"

Note that it doesn't even install the package properly, but extracts it to your root directory. The download does not check any signature, not even SSL certificates. (Source: Bigtop puppet manifests)


Even if your build would work, it will involve Maven downloading unsigned binary code from the internet, and use that for building.
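For contrast, here is a minimal sketch of the kind of check these downloads skip: comparing a file's SHA-256 against a value obtained from a trusted source before doing anything with it. The helper names and the idea of a separately published digest are assumptions for illustration, and a checksum is weaker than a real signature from a web of trust, but it would at least catch a swapped binary.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_hex: str) -> bool:
    """True only if the file matches the digest published by a trusted source."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

A build script would call `verify_download` on the fetched file and abort on `False`, rather than extracting whatever arrived into the root directory.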

Instead of writing clean, modular architecture, everything these days morphs into a huge mess of interlocked dependencies. Last I checked, the Hadoop classpath was already over 100 jars. I bet it is now 150, without even using any of the HBaseGiraphFlumeCrunchPigHiveMahoutSolrSparkElasticsearch (or any other of the Apache chaos) mess yet.

Stack is the new term for "I have no idea what I'm actually using".

Maven, ivy and sbt are the go-to tools for having your system download unsigned binary data from the internet and run it on your computer.

And with containers, this mess gets even worse.

Ever tried to security update a container?

Essentially, the Docker approach boils down to downloading an unsigned binary, running it, and hoping it doesn't contain any backdoor into your company's network.

Feels like downloading Windows shareware in the 90s to me.

When will the first docker image appear which contains the Ask toolbar? The first internet worm spreading via flawed docker images?

Back then, years ago, Linux distributions were trying to provide you with a safe operating system. With signed packages, built from a web of trust. Some even work on reproducible builds.

But then, everything got Windows-ized. "Apps" were the rage, which you download and run, without being concerned about security, or the ability to upgrade the application to the next version. Because "you only live once".

Update: it was pointed out that this started way before Docker: »Docker is the new 'curl | sudo bash'«. That's right, but it's now pretty much mainstream to download and run untrusted software in your "datacenter". That is bad, really bad. Before, admins would try hard to prevent security holes, now they call themselves "devops" and happily introduce them to the network themselves!

3371 days ago
There is some truth in this rant.

Politics Is Poisoning NASA’s Ability to Do What It Needs to Do

1 Share

Well, I told you so.

When Sen. Ted Cruz, R-Texas, was made head of the Senate committee in charge of NASA’s funding, I (and many others) were appalled. Cruz is a science denier, flatly claiming global warming isn’t happening.

This is an issue, since many of NASA’s missions are directly focused on examining the amount, extent, and impact of that warming. And rightly so.

While Cruz may not be able to directly impact NASA’s budget, he can certainly make things difficult on the agency, and pressure others to change NASA’s emphasis. He made this very clear last week when he held a meeting with NASA’s Administrator Charles Bolden as a witness. Cruz opened the session asking Bolden about NASA’s core mission, a clear shot at the idea that they should be looking outwards, not down.

Throughout the session, Cruz downplayed Earth science, claiming that NASA has lost focus on exploring space. It’s clear everything he was saying came from his stance of global warming denial.

And that is utter nonsense, to be incredibly polite. Pure and simple.

Bolden shot back, saying, “We can't go anywhere if the Kennedy Space Center goes underwater and we don't know it, and that's understanding our environment.” In other words, we must study the Earth and its changing climate. Studying our planet is at least as important as studying others.

Second, as Bolden also points out, NASA has been gearing up for doing more human exploration for some time now.* While I am not a fan of the Space Launch System rocket, it will certainly be able to lift a lot of payload into orbit and beyond (though at huge expense). And SpaceX is working on the Falcon Heavy, which will launch well before SLS gets off the ground, and will also be capable of heavy lifting. Its first demo launch will be in just a few months.

Over the years, NASA has had to beg and scrape to get the relatively small amount of money it gets (less than half a percent of the national budget) and still manages to do great things with it. Cruz is worried NASA’s focus needs to be more on space exploration. Fine. Then give them enough money to do everything in their charter: Explore space, send humans there, and study our planet. Whether you think climate change is real or not (and it is), telling NASA they should turn a blind eye to the environment of our own planet is insanity.

Bear in mind, too, Cruz has his sights set on the White House. That’s where NASA’s budget starts. Under a Cruz administration, NASA’s Earth Sciences program would be screwed.

There’s more. A few days before Cruz held his session, the House Subcommittee for Commerce, Justice, Science, and Related Agencies (which has NASA in its jurisdiction) also held a meeting with Bolden as a witness. The chairman, John Culberson, R-Texas, is a friend of NASA; he was the one who fought for more money in NASA’s budget for a mission to Europa.

But even he holds some mistaken ideas about the agency. Right now, we depend on the Russians for access to the International Space Station, and given Russia’s current volatility (to say the least), Culberson asked Bolden what contingency plans NASA has if Russia decides to pull out.

Bolden said the only contingency we have is commercial flight to get humans into space. Culberson took issue with that:

Bolden: Had we gotten the funding that was requested when I first became the NASA administrator, we would have been all joyously going down to the Kennedy Space Center later this year to watch the first launch of some commercial spacecraft with our crew members on it. That day passed. And I came to this committee and I said over and over, if we don’t fund commercial crew …
Culberson: Had NASA not canceled the Constellation program we’d be ready to fly within 12 months.
Bolden: Mr. Chairman that is not correct … whoever told you that, that is not correct.

Hearing Culberson say that makes me grind my teeth. The Constellation rocket system was way behind schedule and well over budget, and that’s why President Obama canceled it, correctly in my opinion. If we had kept it going I’d bet we still wouldn’t be able to put people into space today. At least not without huge impact to NASA’s other capabilities, due to its fixed budget.

And Bolden is right. Over the years, the president’s NASA budget request for commercial flight has been slashed by Congress over and again (in FY 2012 it was cut by more than 50 percent). If that money had instead gotten to NASA, we might very well already be celebrating the launch of Americans into space by an American rocket. Instead, here we are, dependent on the Russians.

Watching Congress grill NASA over what is Congress’ fault is frustrating to say the least.

I have issues with the president’s requests for NASA as well, and I’ve been vocal about them. But on the balance, it’s been Congress that has been slowly squeezing the life out of NASA’s ability to return to human spaceflight. And the shenanigans there still continue, since there has been a lot of political tomfoolery involving SLS, especially when it comes to SpaceX. I suggest Rep. Culberson talk to his colleagues about that before complaining to NASA that they can’t do what they’ve been mandated to do.

Look. NASA is the world’s premier space agency. Yes, I am an American, and yes, I say that with pride. Certainly, the European Space Agency is doing fantastic things, and will continue to do so, but NASA has done more, gone farther, and been more a source of inspiration than any other.

But the politics of funding a government agency is tying NASA in knots and critically endangering its ability to explore.

At one point in his meeting, Rep. Culberson said, “Everything NASA does is just pure good.” That’s a nice sentiment. It would be even better if Congress and the White House would let them do it.

*Planetary missions are in trouble, though.

My thanks to NASA press secretary Lauren Worley for the budget numbers pertaining to commercial space flight.


Some notes on SuperFish

1 Comment

What's the big deal?

Lenovo, a huge maker of laptops, bundles software on laptops for the consumer market (it doesn't for business laptops). Much of this software is from vendors who pay Lenovo to be included. Such software is usually limited versions, hoping users will pay to upgrade. Other software is ad-supported. Some software, such as the notorious " Toolbar", hijacks the browser to display advertisements.

Such software is usually bad, especially the ad-supported software, but the SuperFish software is particularly bad. It's designed to intercept all encrypted connections, things it shouldn't be able to see. It does this in a poor way that leaves the system open to hackers or NSA-style spies.
For example, it can spy on your private bank connections, as shown in this picture.

Marc Rogers has a post where he points out that what the software does is hijack your connections, monitor them, collect personal information, inject advertising into legitimate pages, and cause popup advertisements.

Who discovered this mess?

People had noticed the malware before, but it's Chris Palmer (@fugueish) who noticed the implications. He's a security engineer for Google who just bought a new Lenovo laptop, and noticed how it was Man-in-the-Middling his Bank of America connection. He spread the word to the rest of the security community, who immediately recognized how bad this is.

What's the technical detail?

It does two things. The first is that SuperFish installs a transparent-proxy (MitM) service on the computer intercepting browser connections.
It appears to be based on Komodia's "SSL Digestor", described in detail here.

I don't know the details of exactly how they do this, but Windows provides easy hooks for such interception.
But such interception still cannot decrypt SSL. Therefore, SuperFish installs its own root CA certificate in the Windows system. It then generates certificates on the fly for each attempted SSL connection. Thus, when you have a Lenovo computer, it appears as if SuperFish is the root CA of all the websites you visit. This allows SuperFish to intercept an encrypted SSL connection, decrypt it, then re-encrypt it again.

Only the traffic from the browser to the SuperFish internal proxy uses the SuperFish certificate. The traffic on the Internet still uses the normal website's certificate, so we can't tell if a machine is infected by SuperFish by looking at this traffic. However, SuperFish makes queries to additional webpages to download its JavaScript, which may be detectable.

SuperFish's advertising works by injecting JavaScript code into web-pages. This is known to cause a lot of problems on websites.

It's the same root CA private-key for every computer. This means that hackers at your local cafe WiFi hotspot, or the NSA eavesdropping on the Internet, can use that private-key to likewise intercept all SSL connections from SuperFish users.
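One way to see this from the affected machine itself is to look at who issued the certificate a TLS client actually receives. The sketch below is only an illustration under several assumptions: that the forged certificate names "Superfish, Inc." in its issuer field (the name shown in the certificate manager), that it is issued directly by that root, and that the proxy also intercepts non-browser connections, none of which I have verified. The helper names are made up; the dictionary layout is the one returned by Python's `ssl.SSLSocket.getpeercert()`.

```python
import socket
import ssl

SUSPECT_NAME = "Superfish, Inc."  # assumed issuer name, as shown in certmgr


def issuer_values(cert: dict) -> list:
    """Flatten the issuer field of a getpeercert()-style dict into its values."""
    return [value for rdn in cert.get("issuer", ()) for (_key, value) in rdn]


def issued_by_suspect(cert: dict, suspect: str = SUSPECT_NAME) -> bool:
    """True if any issuer attribute equals the suspect name."""
    return suspect in issuer_values(cert)


def fetch_cert(host: str, port: int = 443) -> dict:
    """Fetch the certificate that the local trust store accepts for host."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

On a clean machine, `issued_by_suspect(fetch_cert(host))` should be false for any ordinary site; a true result would suggest the connection is being re-signed locally.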

SuperFish is "adware" or "malware"

The company claims it's providing a useful service, helping users do price comparisons. This is false. It's really adware. They don't even offer the software for download from their own website. It's hard Googling for the software if you want a copy because your search results will be filled with help on removing it. The majority of companies that track adware label this as adware.

Their business comes from earning money from those ads, and it pays companies (like Lenovo) to bundle the software against a user's will. They rely upon the fact that unsophisticated users don't know how to get rid of it, and will therefore endure the ads.

Lenovo's response

Lenovo's response is here. They have stopped including the software on new systems.

However, they still defend the concept of the software, claiming it's helpful and wanted by users, when it's clear to everyone else that most users do not want this software.

It's been going on since at least June 2014

The earliest forum posting is from June of 2014. However, other people report that it's not installed on their mid-2013 Lenovo laptops.

Here is a post from September 2014.

It's legal

According to Lenovo, users are presented with a license agreement to accept the first time they load the browser. Thus, they have accepted this software, and it's not a "hacker virus". 

But this has long been the dodge of most adware. That users don't know what they agree to has long been known to be a problem. While it may be legal, just because users agreed to it doesn't mean it isn't bad.

Firefox is affected differently

Internet Explorer and Chrome use the Windows default certificates. Firefox has its own separate root certificates. Therefore, the tool apparently updates the Firefox certificate file separately.

Update: Following reports on the Internet, I said Firefox wasn't affected. This tweet corrected me.

Uninstalling SuperFish leaves behind the root certificate

The problem persists even if the user uninstalls the software. They have to go into the Windows system and remove the certificate manually.

Other adware software does similar things

This post lists other software that does similar things.

How to uninstall the software?

First, run the "uninstall.exe" program that comes with the software. One way is from the installed program list on Windows. Another way is to go to "C:\Program Files (x86)\Lenovo\VisualDiscovery\uninstall.exe".

This removes the bad software, but the certificate is still left behind.

For Internet Explorer and Chrome, click on the "Start" menu and select "Run". Run the program "certmgr.msc". Select the "Trusted Root Certificate Authorities", and scroll down to "Superfish, Inc." and delete it (right click and select "Delete", or select and hit the delete key).

For Firefox, click on the main menu "Options", "Advanced", "Certificates". The Certificate Manager pops up. Scroll down, select "Superfish, Inc.", then "Delete or Detrust". This is shown in the image below.

How can you tell if you're vulnerable, or if the removal worked?

Some sites test for you, like

Do all machines have the same root certificate?

The EFF SSL observatory suggests yes, all the infected machines have the same root certificate. This means they can all be attacked. If they all had different certificates, then they wouldn't be attackable.
