
A few sketches


I've been making a few more sketches about Linux debugging tools / opinions in the last week. You can find them in this public Dropbox folder if you're interested. Here's one of them:

Read the whole story
35 days ago
Zagreb, Croatia
Share this story

A workshop on strace & tcpdump


This week at work, I ran a workshop on tcpdump and strace. A couple of people on Twitter asked about it, so here are some notes. This is mostly so I can reuse them more easily next time, but maybe you will also find it interesting. The notes are a bit sparse, and it's not clear that anybody other than me will find them legible or useful, but I decided to put them here anyway.

I basically did a bunch of live demos of how to use tcpdump & strace, and took questions & comments as people had them. I ran it in an hour, which I think was fine for people who already had some familiarity with the tools, but really aggressive if you're learning from scratch. I'll do that differently next time.


Why use tcpdump? We do a lot of web development; almost all of our services talk to each other with network requests. So a tool that can spy on network requests can be a really good general-purpose debugging tool. I've used this quite a bit at work and it's been great.

I didn't really explain what TCP was, which seemed okay.

Ask everyone to install Wireshark at the beginning of the workshop. Wireshark is really easy to install on OS X now! Nobody had trouble with this.

step 1: laptop demo

  • start python -m SimpleHTTPServer 8080
  • run curl localhost:8080/404
  • now that we have some simple network traffic on my laptop, we can look at it with tcpdump!
  • run sudo tcpdump by itself. That's a lot of output, and it's a little difficult to read. Don't worry!
  • talk about network interfaces, here we actually need to run tcpdump -i lo0 or tcpdump -i any to see the local traffic
  • talk a little bit about what the output here means, but note that it's kind of difficult to understand
  • run tcpdump -A port 8080 -i any. port 8080 is shorthand for src port 8080 or dst port 8080; a really great shortcut. The pcap filter syntax can be a little difficult to remember at first; all I've really needed to know so far is how to filter by port and by IP. -A shows us the contents of the packets. We can see the GET request, the content type, and the response in tcpdump's output! This is really cool!
  • point out that Python's SimpleHTTPServer is writing out each header line in a separate packet, and that this makes no sense.
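The step 1 demo can be sketched as a self-contained Python 3 script (http.server is the modern equivalent of Python 2's SimpleHTTPServer; the ephemeral port is my substitution for 8080):

```python
# Sketch of the step 1 demo traffic, assuming Python 3.
import threading
import urllib.error
import urllib.request
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Port 0 asks the OS for any free port; the workshop used 8080.
server = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The `curl localhost:8080/404` step: request a path that doesn't exist.
try:
    urllib.request.urlopen(f"http://127.0.0.1:{port}/404")
    status = 200
except urllib.error.HTTPError as err:
    status = err.code

print(status)  # this request/response pair is the traffic tcpdump -i lo sees
server.shutdown()
```

Running this while `sudo tcpdump -A -i lo port <port>` is capturing shows the same GET/404 exchange as the curl version.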

step 2: look at some QA network traffic

  • ssh to a dev machine
  • run tcpdump on some relevant network traffic, talk about what that network traffic actually means and how you can tell (by looking at the port, knowing what services run on which ports, looking at the hostname the packets are going to)

step 3: tcpdumping for performance

this section is about how to debug performance problems using tcpdump!

  • find a service in production that makes HTTP requests
  • talk about packet captures in production (it's generally safe to do, just be careful to not accidentally fill up your disk if you're trying to do packet capture on video streaming or something, and be aware that there may be customer data in there)
  • "now we're going to get timing information for those HTTP requests!"
  • run tcpdump port $correct_port -w output.pcap
  • press ctrl+c when you feel like you have enough data
  • "pcap files are like chocolate cookies -- every packet analysis tool understands how to read them. So you can write the pcap file on the server and then copy it over for offline analysis"
  • copy the file over to my laptop, get everyone else to copy over the file as well
  • "as you've seen, tcpdump output is a little difficult to interpret and search. some people are really good at that but IMO it's easier to stick with a basic knowledge of tcpdump and do more advanced stuff in Wireshark"
  • open up the pcap file in Wireshark.

Wireshark features to demo:

  • searching by HTTP status (try THAT with tcpdump!)
  • right clicking on a packet to get every other packet in that TCP conversation
  • click statistics -> conversations in the menu and you can sort all TCP sessions by duration to find performance problems
  • you can colour packets by type (TCP/UDP/whatever) to more easily visually see what's going on
  • show where the packet timings show up, and talk through how you can use this to diagnose whether the client or the server is the problem (if the client sends a packet and then the server takes 5 seconds to reply to it, that's AWESOME EVIDENCE)
  • ask other people in the room with experience for their favorite Wireshark features and tactics for getting information out of a capture. People had really great suggestions.

that's all for tcpdump!


step 1: system calls & how to read strace output

  • talk about what a system call is ("the API for your operating system")
  • run strace on a small program like ls
  • talk through what the parts of the output mean ("this is the system call, this is a file descriptor, these are the arguments")
  • explain what happens in strace ls (first you have execve(ls), then the dynamic linker happens, the first part of the strace output is always the same, then you see stuff that's more specific to ls, and at the end you have exit)
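The "system calls are the API for your operating system" point can be made concrete from Python: each os.* call below is (roughly) one system call, i.e. one line of strace output. The strace lines in the comments are typical, and the file descriptor numbers will vary.

```python
# Each os.* call maps onto one line of strace output:
# the syscall name, its arguments, and its return value.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello")           # strace: write(3, "hello", 5) = 5
os.close(fd)                     # strace: close(3) = 0

fd = os.open(path, os.O_RDONLY)  # strace: openat(AT_FDCWD, "/tmp/...", O_RDONLY) = 3
data = os.read(fd, 4096)         # strace: read(3, "hello", 4096) = 5
os.close(fd)                     # strace: close(3) = 0
os.unlink(path)                  # strace: unlink("/tmp/...") = 0
print(data)
```

Running `strace python3 this_script.py` shows these exact calls near the end of the output, after the interpreter-startup noise.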

step 2: getting configuration files

  • Find a Java program which has some configuration
  • "we have no idea how it's configured! How will we ever find out?"
  • run strace -e open -o strace_output.txt the_java_program (I used a Hadoop program). -e open means "only trace this system call", and -o writes the output to a file
  • it turns out that this actually doesn't work if the Java program starts child processes -- for those you usually want strace -f, which follows children too
  • Run strace -f instead
  • grep the output file for .xml, because practically every Java program is configured with XML
  • we find our configuration file! we are winners
  • mention looking at calls to write
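The grep step amounts to a simple filter over the strace output. The sample lines below are invented for illustration -- real output from `strace -f -e open` on a Hadoop program looks similar:

```python
# The "grep for .xml" step, as a filter over (invented) strace -e open output.
sample_strace_output = """\
open("/usr/lib/jvm/jre/lib/rt.jar", O_RDONLY) = 5
open("/etc/hadoop/conf/core-site.xml", O_RDONLY) = 6
open("/etc/hadoop/conf/hdfs-site.xml", O_RDONLY) = 7
open("/tmp/hsperfdata_user/12345", O_RDWR) = 8
"""

config_files = [line for line in sample_strace_output.splitlines() if ".xml" in line]
for line in config_files:
    print(line)  # we find our configuration files! we are winners
```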

step 3: attaching to a running program, and stracing a CPU-only program

  • write a tiny Python program that does nothing but spin:

    while True:
        pass

  • then attach to it with sudo strace -p <pid>. When you're attaching to a running process, you generally have to run strace as root. This program doesn't make any system calls, so strace shows nothing!

If you want another cool demo of stracing a running program, find is a good example: run find / and then attach to it with strace -- it'll show you which files find is looking at right now!
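The CPU-only program can be packaged as a tiny script; spin and its iterations parameter are my additions so the loop can also be run in bounded form (the workshop version just loops forever):

```python
# A CPU-only program: once running, it makes no system calls,
# so attaching strace to it prints no output at all.
def spin(iterations=None):
    count = 0
    while iterations is None or count < iterations:
        count += 1  # pure CPU work: nothing for strace to show
    return count

# For the demo: call spin() with no argument in one terminal, then
# in another terminal run: sudo strace -p <pid>
print(spin(3))  # bounded run, just to show the function works
```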

step 4: stracing to understand a performance problem

  • (secretly, beforehand) make a tiny flask server that responds to GET /hello with 'hi!' after sleeping for 2 seconds
  • run a small bash script that just runs curl in a loop
  • the script is slow! But why is it slow?
  • run strace on the script, and see how clearly it pauses in the wait call. Talk about other system calls you'll often see strace pause in (like select)
  • now is a really good time to mention that STRACING PRODUCTION PROGRAMS WILL SLOW THEM DOWN AND YOU NEED TO BE VERY CAREFUL. Sometimes you do it anyway if the process is a mess
  • talk about what to do if you don't know what a system call means (what's wait4? You can run man 2 wait4 on any system to get the man page for that system call)
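The step 4 setup can be sketched like this; the original used a tiny Flask server, but plain http.server keeps the sketch dependency-free (SleepyHandler is an invented name):

```python
# Sketch of the slow-server demo: a server that sleeps 2 seconds
# before answering GET /hello with 'hi!'.
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class SleepyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(2)  # the hidden performance problem
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"hi!")

    def log_message(self, *args):
        pass  # keep the demo output quiet

server = HTTPServer(("127.0.0.1", 0), SleepyHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# One iteration of the curl loop: strace on the client side shows it
# parked for ~2 seconds in a wait-style syscall per request.
start = time.time()
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/hello").read()
elapsed = time.time() - start
print(body, round(elapsed, 1))
server.shutdown()
```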


  • Break for questions.
  • mention that I wrote a zine about strace which is an ok basic reference


I thought this went pretty well, especially given that I prepared it only 2 hours in advance. I think I'll do it again sometime! I want to get better at doing workshops and talks at least 2-3 times because preparing good material is so hard, and I always learn so much about the talk/workshop the first time I give it.

If you want to adapt this workshop for your cool friends whom you want to know about strace or tcpdump, you could! Most likely this is way too sketchy for anybody other than me to use.


D.C. Improvisers Collective - Ministry of Spontaneous Composition (s/r, 2016) ****

By Derek Stone

In 2014, the D.C. Improvisers Collective released In the Gloam of the Anthropocene, a stellar effort that combined improvised jazz with a myriad of other influences: driving post-rock, shimmering drone, and circular rhythms that approached krautrock in their machine-like regularity. Each iteration of the Collective has undergone slight line-up changes, but Ben Azzara (drums) and Jonathan Matis (electric piano) seem to be the members who’ve stuck around the longest. On this, their newest recording, those two are joined by Chris Brown (bass and electronics), John Kamman (guitar), Ben Redwine (clarinet), and Patrick Whitehead (trumpet/flugelhorn). The result is an album that distills and refines the essence of what the Collective do best: long-form improvisations that seamlessly integrate a number of moods, styles, and rhythmic modes.

The first and longest piece here, “The Division of Unlearning,” starts with a slippery, circular stream of notes from clarinetist Ben Redwine, and steadily adds layers from there. Soon, the band is locked into a tight, swinging groove that seems to owe more to the group’s jazz influences than anything else. However, the D.C. Improvisers Collective is never comfortable sticking with one style for too long - after a few minutes, drummer Azzara lays down a sturdy, relaxed rhythm, Matis’s electric piano takes a more prominent role, and the overall effect is of a transformation from the jazz idiom to something resembling 70’s prog-rock (think: Can, or the jammier side of Pink Floyd). Midway through, Redwine’s snaky clarinet returns, Azzara speeds up the tempo, and (with John Kamman’s guitar sounding straight out of Agharta) the band offers a healthy dose of sinister fusion. It’s this constant restlessness and eagerness to try new things that sets the D.C. Improvisers Collective apart and makes them such an idiosyncratic force; furthermore, they never sound like a cheap imitation or ersatz copy of the music they are tipping their caps to - if anything, the band breathes new life into some of these genres, showing how slight alterations and recontextualizations can resuscitate even the crustiest stylistic corpse. “The Division of Unlearning” ends with a section that sounds as if it could fit comfortably on the eponymous first album from Neu! There are propulsive drums, airy melodies that wind in and out, and an overall sensation of forward movement, joyous and irrepressible.

The other two pieces are much shorter, but they are not lacking in the exploratory impulse that is in evidence all throughout the first. “Unified Conspiracy Theory” starts at an unhurried pace, with the players contributing in brief, leisurely stretches. As the piece develops, though, the relatively stable structure begins to show cracks: Whitehead’s trumpet and Redwine’s clarinet drop their fetters, so to speak, spiralling upwards in progressively more unrestrained eddies; the guitar spits and snarls; Chris Brown weaves a jaunty bass-line throughout. Just when things seem to be on the verge of disintegration, the piece is overtaken by swathes of electronic noise, and the players ride the wave of ambience to its conclusion. “Dark Matter Denial” is a bluesy, swaggering track, Jonathan Matis leading the way with a menacing, yet alluring,  melody on the keys. Compared to the rest of the album, this piece doesn’t really change directions or veer off in unforeseen ways; the percussion and keys are rock-steady, acting as a canvas upon which the other players can paint. At the album’s close, the members of the D.C. Improvisers Collective erupt into laughter, undoubtedly feeling elated at the glorious noise they’d just created. I’d be lying if I said I didn’t feel the same way after hearing it!


Emissions Test-Driven Development

Emissions Test-Driven Development is a software development methodology adapted specifically to the programming of internal combustion engine control units (ECUs). ETDD is a subcategory of a more general form of software development known as "Adversarial TDD", which is a kind of Test-Driven Development. In Test-Driven Development, the software development cycle begins with the creation of a test case, which naturally fails at first. Once this is done, a minimal amount of code is produced which causes the test case to pass. The process continues, tests first, code second, with the tests forming an implicit specification for the behaviour of the software. If the tests pass, then the software is considered correct. Assuming the tests fully encompass all the desired behaviour of the software, then the software is complete. (Opinions differ on whether this is a universally sound approach for creating good software. But opinions also differ on whether there is such a thing as good software; on ...
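The red/green cycle described there, and its "adversarial" variant, can be sketched in a few lines of Python; all names here are invented for illustration, and real cheating would detect the test rig rather than take an explicit flag:

```python
# Ordinary TDD: the test is written first and forms the specification...
def test_emissions():
    assert emissions_grams_per_km(on_test_rig=True) <= 95  # a legal limit, say

# ...and then minimal code is written to make it pass. An *adversarial*
# implementation passes by detecting the test conditions (reduced here
# to an explicit flag) and behaving differently under them.
def emissions_grams_per_km(on_test_rig):
    return 90 if on_test_rig else 400

test_emissions()                                   # all tests green: "correct" software
print(emissions_grams_per_km(on_test_rig=False))   # behaviour on the road
```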

Where Do We Find Ethics?


I was in elementary school, watching the TV live, when the Challenger exploded. My classmates and I were stunned and confused by what we saw. With the logic of a 9-year-old, I wrote a report on O-rings, trying desperately to make sense of a science I did not know and a public outcry that I couldn’t truly understand. I wanted to be an astronaut (and I wouldn’t give up that dream until high school!).

Years later, with a lot more training under my belt, I became fascinated not simply by the scientific aspects of the failure, but by the organizational aspects of it. Last week, Bob Ebeling died. He was an engineer at a contracting firm, and he understood just how badly the O-rings handled cold weather. He tried desperately to convince NASA that the launch was going to end in disaster. Unlike many people inside organizations, he was willing to challenge his superiors, to tell them what they didn’t want to hear. Yet, he didn’t have organizational power to stop the disaster. And at the end of the day, NASA and his superiors decided that the political risk of not launching was much greater than the engineering risk.

Organizations are messy, and the process of developing and launching a space shuttle or any scientific product is complex and filled with trade-offs. This creates an interesting question about the site of ethics in decision-making. Over the last two years, Data & Society has been convening a Council on Big Data, Ethics, and Society where we’ve had intense discussions about how to situate ethics in the practice of data science. We talked about the importance of education and the need for ethical thinking as a cornerstone of computational thinking. We talked about the practices of ethical oversight in research, deeply examining the role of IRBs and the different oversight mechanisms that can and do operate in industrial research. Our mandate was to think about research, but, as I listened to our debates and discussions, I couldn’t help but think about the messiness of ethical thinking in complex organizations and technical systems more generally.

I’m still in love with NASA. One of my dear friends — Janet Vertesi — has been embedded inside different spacecraft teams, understanding how rovers get built. On one hand, I’m extraordinarily jealous of her field site (NASA!!!), but I’m also intrigued by how challenging it is to get a group of engineers and scientists to work together for what sounds like an ultimate shared goal. I will never forget her description of what can go wrong: Imagine if a group of people were given a school bus to drive, only they were each given a steering wheel of their own and had to coordinate among themselves which way to go. Introduce power dynamics, and it’s amazing what all can go wrong.

Like many college students, encountering Stanley Milgram’s famous electric shock experiment floored me. Although I understood why ethics reviews came out of the work that Milgram did, I’ve never forgotten the moment when I fully understood that humans could do inhuman things because they’ve been asked to do so. Hannah Arendt’s work on the banality of evil taught me to appreciate, if not fear, how messy organizations can get when bureaucracies set in motion dynamics in which decision-making is distributed. While we think we understand the ethics of warfare and psychology experiments, I don’t think we have the foggiest clue how to truly manage ethics in organizations. As I continue to reflect on these issues, I keep returning to a college debate that has constantly weighed on me. Audre Lorde said, “the master’s tools will never dismantle the master’s house.” And, in some senses, I agree. But I also can’t see a way of throwing rocks at a complex system that would enable ethics.

My team at Data & Society has been grappling with different aspects of ethics since we began the Institute, often in unexpected ways. When the Intelligence and Autonomy group started looking at autonomous vehicles, they quickly realized that humans were often left in the loop to serve as “liability sponges,” producing “moral crumple zones.” We’ve seen this in organizations for a long time. When a complex system breaks down, who is to be blamed? As the Intelligence & Autonomy team has shown, this only gets more messy when one of the key actors is a computational system.

And that leaves me with a question that plagues me as we work on our Council on Big Data, Ethics, and Society whitepaper: How do we enable ethics in the complex big data systems that are situated within organizations, influenced by diverse intentions and motivations, shaped by politics and organizational logics, complicated by issues of power and control?

No matter how thoughtful individuals are, no matter how much foresight people have, launches can end explosively.

(This was originally posted on Points.)


PLoS-1 published a “creationist” paper: some thoughts on what followed


As everyone knows by now, PLoS-1 published what seemed to be a creationist paper. While references to the ‘Creator’ were few, the wording of the paper strongly supported intelligent design in human hand development. A later statement from the first author seemed to eschew actual creationism, but maintained a teleological (if not theological) view of evolution, saying that human limb evolution is unclear. The paper was published January 5, 2016. However, it seemed not to get any attention. The first comment on the PLoS-1 site was on March 2, when things blew up on Twitter, quickly adopting the #handofgod and #creatorgate hashtags. (As far as I could tell, the paper’s URL had not appeared on Twitter before March 2, except for a single mention the day it was published.) On March 3, PLoS-1 announced that a retraction was in process.

Open Access is not broken

Probably the strangest reaction I have seen to #handofgod was this article in Wired, which examined the old trope that open access articles are poorly reviewed. I thought we were already beyond that, and that at least science writers had educated themselves on the matter. Review quality has nothing to do with the licensing of the journal! Tarring all OA publications with the same brush, without even saying why open access is relevant to this problem, is simply poor journalism.

Oh, and please stop confusing Open Source (for software licensing) with Open Access (for licensing research works). The two terms stem from the same philosophy of sharing and reuse, but they are best not conflated.

PLoS-1 is not broken

Saying that publishing this paper shows the failure of PLoS-1’s publication model is like saying that because you read a news story about someone who got run over on the sidewalk, you will never walk on a street shared with cars ever again. PLoS-1 publishes 30,000 papers per year. It took PLoS-1 less than a month from publication to retraction, and less than 48 hours from when this paper came onto the social media radar. In contrast, it took The Lancet 12 years to retract Andrew Wakefield’s infamous paper on vaccines and autism; a paper that was not just erroneous, but ruled to be fraudulent, and has caused far more damage than a silly ID paper. Also, I am still waiting for Science to retract the Arsenic Life paper from 2010, and for Nature to retract the Water Memory paper from 1988. At the same time, I bet that only a few of those who clamored that they would resign from editing for PLoS-1 will turn down an offer to guest edit for Science. Here’s an idea for all PLoS-1 editors who are “ashamed to be associated with PLoS one”: instead of worrying or self-publicizing on Twitter or in PLoS’s comment section, take up another paper to edit, and make sure it is up to snuff.

Or, you know what, go ahead and resign; if your statistical and observational skills are so poor that you cannot recognize your own confirmation bias, you should not be editing papers for a scientific journal.


We should not move to a system that is exclusively post-publication peer review

One argument was made that peer review failed because this paper got through. OK: people die in car crashes even if they wear seat belts. That doesn’t mean you should never wear your seat belt because you may die anyway. Pre-publication peer review is a safety valve: it helps maintain a certain level of quality and interest appropriate for the journal at hand. In PLoS-1, that would mean anything that is scientifically sound. In other journals, topical interest, as well as gauging a level of novelty or impact, may play in as well. Like seat belts, it is not 100% reliable (obviously), and it is hugely problematic (OK, this is where the seat-belt analogy breaks down, pun intended).

Exclusive post-publication peer review might be mostly good for those who have already established themselves as prominent scientists, and whose papers will be read anyway. I have yet to hear of a post-pub plan that helps filter and rank papers somehow. And no, the good science will not always “make it” somehow. Yes, pre-publication peer review can be horribly slow and unfair. But doing away with it completely is not a viable solution to publication woes, especially when a viable alternative is not proposed. But see here for an alternative and interesting, if somewhat open-ended, worldview. Also, see below about making pre-publication reviews public.

(Added later) There is also the worry of the mob mentality of post-pub review, which may lead the editors of a journal to a harsher response than is actually warranted. This concern was expressed about the swift retraction by PLoS-1.


Alternative metrics measure interest, not quality

The issue of alt-metrics per se has nothing to do directly with the #handofgod paper, but the number of tweets and Facebook shares of this article’s URL shot through the roof (1,446 as of the time of this writing). Alt-metrics advocates keep saying that counts of social media chatter, downloads, and web views are a more reliable metric of the interest in a paper than, say, traditional citations. (And, of course, than the much-maligned and manipulated Journal Impact Factor, which even Thomson Reuters, which originated it, says is an inappropriate metric for assessing individual papers, authors, or institutions.) Alt-metrics advocates are probably correct in saying that a high altmetric score shows a high level of interest on social media, but interest in a paper, by itself, is not necessarily a good thing. You need some additional metrics to complement it and say whether the interest the paper warranted comes from a good place. Also, your paper may merit social media interest and downloads, but not receive them, for various reasons.

If your paper is really bad, you may get attention on social media, and many views and downloads. If your paper is really good, you may also get attention on social media. But you won’t get attention simply because your paper is really bad or really good; you will get attention because your paper is an attention-getter. If you publish a population survey of fish in an obscure pond over 5 years, and completely mess up the diversity equations, no one will notice. If you publish an interesting variation on how to build phylogenetic trees, you may be heralded in your sub-field, but not much more. If your paper is picked up by a media outlet, or a large journal’s news section, you will get more attention. But that would mean that your paper is very relevant to current public interests, in a good or bad way.

Relevant note: one way to guarantee your paper gets high alt-metrics, is to have it discussed on Retraction Watch. You probably don’t want that.

So if your paper is sexy, in a good or bad way, and it gets tweeted by someone with many followers, you will get a high count. Research idea: check the correlation between corresponding authors’ numbers of Twitter followers and their alt-metric counts.

We should make reviews public

This is probably the only good idea I have heard so far to help prevent a recurrence of the #handofgod mess. I am not sure the paper’s reviews and editorial decision will ever be made public, but I am confident that the reviews do not mention the ID issue as problematic -- or, on the slight chance that any of them do, that the editor did not acknowledge the ID rationale when approving the paper for publication. We have all received the occasional review that was lazily written and completely uninformative. They may have been positive and uninformative (“I have no comments, good paper”), or negative and uninformative (“this paper should not be published in your journal”). With public reviews, the editor would be forced to get decent, informative reviews, or look for other reviewers. And once the paper is out, we would be able to see how and why it made it past review.

A note on reviewer anonymity: personally, I would prefer public reviews where the reviewers have the choice to remain anonymous. For good or bad, many scientists’ careers, especially junior scientists’, still depend on the good graces of their colleagues; and scientists can be just as petty and vindictive as the rest of the human race. Anonymity helps the little fish be honest about the big fish without fear of retribution. Yes, it may foster less-than-honest reviews from the little fish, but that is why several reviewers are used. PeerJ and eLife already have public, anonymous (or signed by choice) peer reviews.
