Summarizing "Humans Need Not Apply"

The ever-popular CGPGrey recently released a video on the potential future economic challenges of AI and automation, "Humans Need Not Apply". The longest video on his channel yet, it has attracted a good amount of attention in the first 24 hours, particularly on reddit, where long discussions broke out in a handful of popular sub-reddits, including /r/Futurology, /r/Economics, /r/videos, and /r/CGPGrey itself.

It's a departure from his typical style, which is usually a deep dive into a purely factual, but complex, topic. This time he focuses on the looming problem of how the future expansion of AI and automation will affect the human job market. The video covers, at a high level, the current state of automation and why he strongly believes it will soon prove capable of taking over many jobs we typically think of as outside the realm of computer and machinery automation, leaving a lot of people "out of work through no fault of their own".

Having read a lot of replies to the video, I would like to offer my take on Grey's position in a way that will hopefully clarify some points and address some common counter-arguments. I think he made his point clearly, but it may help some to see things restated slightly differently. I will try my best to stick to what was put forth in the video and avoid my own thoughts. (Naturally, I will assume you watched the video.)

Grey argues that how automation is adopted in the future will be very different from how it was in the past. Historically, automation has benefited humans. When a job becomes automated we reap the benefit of not having to do it ourselves, we often get the job performed better than we did it, and the work the machines took over gets replaced in some way. For the most part we still have enough jobs available, particularly for skilled workers. There is a prevalent attitude that future automation won't pose a significant threat to the job economy because "it's always worked out in the past". But Grey argues that the future of automation cannot be extrapolated from the past:

  • Automation will expand at a speed we haven't seen before. In the past, automation has tended to expand relatively slowly. Usually only a couple industries were revolutionized at a time and it would take many years to make the transition. New generations have time to train for different jobs, workers from one industry can shift to another industry, etc.

    But in the future, automation will invade many industries very quickly. We won't have the luxury of easing into a massively automated world. There are major business incentives to capitalize on automation, and the engineering effort is actually there to deliver. AI used to be a theoretical field of study, but now it's one of the most popular subjects of academic and self-motivated study in engineering and one of the most demanded fields of computer science employment. We're seeing practical growth in this area like we've never seen before.

  • More types of jobs will be automated than ever before. In the past, we've largely seen manual tasks automated. Machines were single-purpose and, at some level, stupid. Highly specialized skill sets, professional occupations, and the like have typically been safe from automation. Not so anymore. High-paying professions dominated by highly trained, specialized humans will fall prey to automation. There won't be a safe haven of jobs that we can turn to.

    These have always been our fall-back plan. We've usually been able to give less-desirable jobs to machines and move the workers to equivalent or nicer jobs. But what happens when the machines are doing those jobs too? So far, automation hasn't done the "smart" jobs; it does jobs that we don't want to do (like dig ditches) or that we simply can't do fast enough (like calculate an inverse square root millions of times a second), but it doesn't really do "intelligent" or "judgment" things. That's our domain. Except, Grey contends, it isn't. AI has started to automate things we consider "intelligent" and it's doing quite well. General-purpose computing and learning will allow AI to enter many new domains.

    The implication here is that historically we've seen individual jobs become automated. In the future, it may be entire professions.

  • Grey points out that the 32 most common professions in the United States are each over 100 years old, meaning that we have not yet had to deal with automation taking over a huge chunk of our jobs. And many of those most common professions are solid candidates for the next generation of automation.

In short, we have many times more automation waiting for us in the near future than we've experienced in the past, and a good portion of it is going to hit types of jobs we've never seen automated.

Of course, worries about jobs in a future of increased automation aren't new. People have always worried about machines taking their jobs. The novella Manna takes Grey's exact warnings about AI and explores them to the extreme; it was published 11 years ago. Books on the general end of capitalism significantly predate that. Why does Grey contend that this is the time to look for a falling sky?

Because those futuristic machines are already here. The video quickly mentions many jobs for which automation sounds like a futuristic dream but which, in reality, already have existing, successful automation. It may not be mass-produced, mass-marketed, cheap enough for bulk purchase, or finely tuned enough to fully replace a human, but the near-human proof-of-concept has already been built and demonstrated successfully. We already have the blueprints and we've built them, so the "hard" part is behind us; the rest is time and business.

So to fix the problem, what's his call to action? Actually, there's a notable lack thereof. Rather, my primary take-away was that we need to start thinking about these kinds of problems now. When we have economic instability, automation taking more jobs than it is creating, and a rising gap between the poor and middle class, it will be too late to sit down and refactor large portions of society and the economy from scratch. Instead, we need to be constantly aware of the impending problem so that when it starts to manifest itself we can react quickly. Being aware of the impending problem allows us to think through the possible solutions ahead of time and be prepared to adopt them when the time comes.

We will have problems like:

  • How do you support entire industries of hard-working people whose skilled jobs have been largely obsoleted?
  • How do you maintain standards of living in a country that doesn't need everyone to work?
  • How do you educate the next generation when most jobs are a stone's throw away from being automated?

And so forth. None of those questions are new to philosophers, but a lot of first world nations haven't seriously pondered them, let alone come to any decisions, let alone made any moves toward adapting.

People are resilient, and while it's true that societies can adapt to substantial change, they generally need the change to be gradual. Macroeconomic changes are not exactly agile. One of Grey's main points above is that the changes will probably happen quickly. If we're unlucky, entire industries could disappear from the human job market not in a couple decades, but a couple years. The necessary changes to accommodate it would be substantial.

Because the economic shift will be unprecedented and will possibly usher in completely new ideas about job, career, and education expectations and standards of living, we will need to re-think a lot of how society operates. If re-thinking large portions of society is the only long-term solution, we're going to wish that we'd spent the preceding years thinking about those problems and taking every preemptive step possible.

I like Grey's video (and I also enjoy his musings in general). But once again, a reminder: this article is a summary of the video as I understood it, not necessarily my personal position on the matter.

ECB Isn't a Mode of Operation

  • By Brad Conte, October 30, 2013
  • Post Categories: Security

ECB isn't a block cipher mode of operation. At least, not for a developer.

In fact, I would suggest that calling ECB an encryption mode is similar to calling "see if it opens in Word after you decrypt it" a MAC. Or even akin to calling "textbook" a type of RSA padding and listing it next to OAEP in a crypto API. From a development perspective, ECB mode, "open it in Word", and "textbook RSA padding" are all examples of a horrible ad-hoc scheme that attempts to address a cryptography problem without using a real cryptographic scheme.

I understand why we formally categorize ECB as a mode of operation, and that makes perfect sense. Conceptually, ECB is the trivial/zero case of operation modes; you might even call it the degenerate case. In a formal setting, labeling the degenerate case under the same term as the general case is perfectly fine.

But ECB isn't a mode of operation for developers of real world cryptosystems in that it doesn't satisfy the requirements we have for all the other modes of operation. Modes of operation were designed to turn the block cipher primitive into a more general encryption scheme. By themselves, block ciphers are just pieces of the overall puzzle. They have security models that are great, but have nothing to do with end-to-end cryptosystems. Block ciphers are built to provide a piece of cryptographic functionality, a piece that needs to be leveraged by a bigger scheme to produce a real world cryptosystem. ECB is a "mode" that doesn't do that.

Calling ECB a "mode of operation" in a software interface implicitly confers a status upon ECB that it doesn't deserve. It implies to all who see it, "ECB is an official mode of operation that fulfills some of the needs of a mode of operation," when the reality is that it does not. Inappropriate use of ECB mode (which is pretty much always "using ECB mode") is one of the most classic cryptography mistakes in software development and allows simple vulnerabilities that a middle-schooler could be trained to exploit. ECB is the zero case, and giving the zero case a name that doesn't sound like the zero case has hurt the community of developers who use it.
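
To make the leak concrete, here's a minimal Python sketch. The `toy_block_encrypt` function below is a stand-in for a real block cipher (it's built from a hash and isn't invertible, so it's for illustration only); the point is that ECB encrypts each block independently, so identical plaintext blocks produce identical ciphertext blocks:

```python
import hashlib

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for a block cipher: deterministic, key-dependent,
    fixed-size output. (Not invertible -- illustration only.)"""
    return hashlib.sha256(key + block).digest()[:8]

def ecb_encrypt(key: bytes, plaintext: bytes, block_size: int = 8) -> bytes:
    """"ECB mode": encrypt each block independently, and nothing else."""
    blocks = [plaintext[i:i + block_size]
              for i in range(0, len(plaintext), block_size)]
    return b"".join(toy_block_encrypt(key, b) for b in blocks)

key = b"secret key"
# Two identical 8-byte plaintext blocks...
ct = ecb_encrypt(key, b"ATTACK!!ATTACK!!")
# ...produce two identical ciphertext blocks, leaking the repetition.
assert ct[:8] == ct[8:16]
```

Any block-granularity structure in the plaintext survives into the ciphertext, with no key required to see it. That is the entire "degenerate case" complaint in one assertion.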

Consider a typical crypto API. The user specifies a parameter to the library choosing a mode of operation for the block cipher. What kind of options do they have? If they're using OpenSSL, Crypto++, .NET, Java, or PyCrypto, then it is a list analogous to this:

    // [...]

The options might be classes, maybe the function names use them as suffixes, etc, but the options are grouped together thematically like that. In a case like this, the label "ECB" is a lie because ECB isn't just another one of the choices. Instead, this would be a more honest list:

    // [...]

This clearly labels which one is the "0"/"off"/"none" option. "You aren't using a mode of operation, so implicitly you're missing the benefit of using one," it says.

Obviously, developers should research what they're using. And to be fair, some APIs (including some listed above) have the courtesy to point out ECB's security pitfalls in the documentation. But the documentation is generally an incomplete warning, saying something like "warning: ECB mode leaks information about the plaintext". How does the developer know if that problem applies to their use case? Just how bad is it? A warning in documentation requires that the developer read the documentation (reasonable), assess the severity of the problem presented (somewhat reasonable), derive or research all the other problems ECB brings (semi-unreasonable), and then realize that ECB is nothing like what they're looking for. Relabeling "ECB" as "NONE" hints at a lot of that up-front.

Imagine if we did the same thing for RSA. "Textbook" RSA (aka, encode the message as an integer and exponentiate it) is also a cryptographic primitive and it shouldn't be used in real cryptosystems either. Proper use of RSA requires that a padding scheme be applied to the plaintext first, but imagine if RSA functionality was exposed in APIs with a variety of padding options like "OAEP", "PKCS1.5", and "textbook", all presented together like they're the same thing. "Textbook" padding is just writing the plaintext as an integer with leading 0s. It's the zero/base case, like ECB for block ciphers, and has no business being listed next to the other padding options. Choosing "textbook" RSA padding is really choosing no padding, since the padding scheme isn't addressing any of the issues the other padding schemes exist to address. (Thankfully, I've never seen an API offer "textbook" as an RSA padding scheme.)

Even academia notes that ECB is out-of-place when modes of operation are considered as a whole. Consider "Evaluation of Some Blockcipher Modes of Operation" by Rogaway. When it summarizes ECB (pg. 5) it points out that ECB is the black sheep of the operational modes family, not meeting the same pattern of practical usefulness that the others do:

[...] ECB should not be regarded as a "general-purpose" confidentiality mode.

Nothing in this article is new. Yes, my complaint is largely about semantics, and obviously the perils of ECB have been preached for a very long time. And yes, developers are responsible for the code they write. But misleading interface choices bear a portion of the blame when the masses use (and abuse) them, and I believe that listing "ECB" under "modes of operation" in developer-oriented APIs and documentation is misleading. "ECB" is a term best relegated to formal cryptographic speak. It should not be used anywhere someone is building a real-world cryptosystem.

Unfortunately, this is just one problem among many, as cryptography libraries have a long history of making it easy for developers to misuse them. ECB is just one example of presenting bad choices to developers. Part of the bigger picture solution is to have good cryptography interfaces that hide lots of details, allowing as few potentially fatal decisions as possible. But I still find the ECB issue particularly annoying, primarily because it's rooted in semantics. The fact that real-world cryptography problems start with vocabulary choices is sad.

Certificate Verification Using load_verify_locations()

  • By Brad Conte, September 13, 2013
  • Post Categories: Security

I was writing a small Python script recently to use a web API over HTTPS. To use HTTPS properly, I naturally wanted to validate the server's certificate before trusting the HTTPS connection. Sadly, doing something so simple required more digging than I would have thought.

I was using Python 3's http.client module for HTTPS, so I needed to use the ssl.SSLContext.load_verify_locations method of my SSL context to specify the path of the trusted CA certificate store for the cert verification. I wanted to use the pre-existing CA store on my computer, which I figured would be straightforward.

Like many crypto APIs in higher-level libraries, the Python function turned out to be pretty much a pass-through wrapper to the OpenSSL API of the same name. I hadn't used that API before, so I read the brief Python documentation, then skimmed the OpenSSL documentation from which I gleaned:

int SSL_CTX_load_verify_locations(SSL_CTX *ctx, const char *CAfile,
                                   const char *CApath);

If CAfile is not NULL, it points to a file of CA certificates in PEM format. The file can contain several CA certificates identified by

 ... (CA certificate in base64 encoding) ...

sequences. [...]

If CApath is not NULL, it points to a directory containing CA certificates in PEM format. [...]

That seemed relatively straightforward. I looked at the "ca-certificates" package on my Arch Linux system to see where the system's default CA certificates were installed. (The ca-certificates package is a bundle of default, often pre-installed, CA certificates deemed worthy of shipping out-of-the-box by organizations like Mozilla. Other Linux distros may use a different name for the package.) I tried load_verify_locations with the CApath argument pointed at the ca-certificates store, which in my case was /usr/share/ca-certificates/mozilla. Even though the PEM certificates were there (the ".crt" files are PEM; a PEM file may be a concatenation of certificates, so a single-certificate ".crt" file is just the trivial case), the Python module didn't find the certificates and the HTTPS connection returned an SSL: CERTIFICATE_VERIFY_FAILED error. I tried pointing the CAfile argument at the X509 cert files, but that failed too.

I searched for example usage of load_verify_locations in Python and C. I was very disappointed to find no useful examples among the first 20 or so results. I found a lot of code that ignored certificate validation entirely (which was disturbing), code that only used arguments pulled from external settings, and code that used paths to app-specific certificates. But there were no literal examples of using the environment's existing store.

I went back to the OpenSSL documentation. Whoops, I hadn't read it fully:

If CApath is not NULL, it points to a directory containing CA certificates in PEM format. The files each contain one CA certificate. The files are looked up by the CA subject name hash value, which must hence be available.


Prepare the directory /some/where/certs containing several CA certificates for use as CApath:

OpenSSL needs the cert directory to be set up first. The cert filenames within that directory are the CA subject name hashes. OK, where was that set up on my system?

After more searching I stumbled across a recent, helpful comment on a Python bug report. (Less than surprisingly, the bug was advocating that Python make the system CA cert store easier to access, although the suggested method itself was not a good idea.) The comment provided a list of example paths for CApath and CAfile on different common *nix systems:

cert_paths = [
    # Debian, Ubuntu, Arch, SuSE
    # NetBSD (security/mozilla-rootcerts)
    # Debian, Ubuntu, Arch: maintained by update-ca-certificates
    # Red Hat 5+, Fedora, Centos
    # Red Hat 4
    # FreeBSD (security/ca-root-nss package)
    # FreeBSD (deprecated security/ca-root package, removed 2008)
    # FreeBSD (optional symlink)
    # OpenBSD
    # Mac OS X
]

These are locations set up for OpenSSL's load_verify_locations use. On my system, the majority of the files were symlinks to certs in the store, where the symlink name was the CA subject name hash. The tools c_rehash (part of OpenSSL) and update-ca-certificates (part of ca-certificates) set up this path for OpenSSL.

It was a little disappointing that there was no de-facto path, environment variable, or the like for finding the prepared OpenSSL-ready CA store. And while you can't rely on the CA store being in the same place across platforms, if this list is accurate then /etc/ssl/certs/ seems like a very popular choice on Linux.

So in my Python script, I used:

ssl_ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
ssl_ctx.verify_mode = ssl.CERT_REQUIRED
ssl_ctx.load_verify_locations(None, "/etc/ssl/certs/")

and got SSL certificate verification to work using my pre-installed certificate store.
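
As a sketch, that setup can be wrapped in a small helper. (This version uses ssl.PROTOCOL_TLS_CLIENT, a constant added in Python 3.6, after this post was written, which defaults to verifying both certificates and hostnames; the capath value is the Debian/Arch-style location discussed above and will vary by system.)

```python
import http.client
import ssl

def make_verifying_context(capath=None):
    """Build an SSLContext that requires and verifies server certificates.

    `capath` must be an OpenSSL-ready (c_rehash'd) directory, such as
    "/etc/ssl/certs/" on many Linux systems. If None, fall back to the
    platform's default trust store.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # verifies by default
    ctx.verify_mode = ssl.CERT_REQUIRED
    if capath is not None:
        ctx.load_verify_locations(cafile=None, capath=capath)
    else:
        ctx.load_default_certs()
    return ctx

# Usage sketch (host and path are placeholders):
# ctx = make_verifying_context("/etc/ssl/certs/")
# conn = http.client.HTTPSConnection("api.example.com", context=ctx)
# conn.request("GET", "/")
# resp = conn.getresponse()
```

A bad or self-signed server certificate then fails the handshake with CERTIFICATE_VERIFY_FAILED instead of silently connecting.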

The solution was indeed fairly straightforward; I was just surprised by how long it took to find it. I had originally thought there would be lots of examples since, well, scripts aren't carrying their own CA lists around and they are performing certificate validation. Right?

On that disturbing note: Please, devs, do verify certificates! HTTPS is just HTTP with obfuscation if you don't do cert validation. Anyone can MITM the connection with a self-signed cert. Even if the application is doing something mundane, like some seemingly boring API queries, you probably still want to protect your API key. If using the pre-installed cert store was confusing before, I hope this article helped address that issue.

What Exactly Does "Rooting" Mean?

There is a lot of interest in "rooting" Android and iOS devices so that the owner can do more fun things with them. But most rooting guides are simple "how-to" lists without much explanation of how the process works and what is being done. This can make the process feel arbitrary and complicated, particularly for beginners. But it's not complicated, and if you're going to root your device, you're better off knowing as much as you can about it.

This is a conceptual and technical overview of what rooting a modern mobile device entails. We'll cover the whole picture of rooting in simple, non-technical language. (If you've rooted a device before, you've probably inferred a fair amount of the information here.) The concepts should apply to both Android and iOS (although examples will favor Android since I have more experience with it).

We'll work backwards through the topics.

What is "Root"?

Root is a term that refers to the de-facto ultimate administrative account on a device. It's a term that originated with Unix systems a long time ago. On a Unix-like system (Android, iOS, Linux, and OSX are Unix-like in concept), there is an account literally named "root" that exists by default and has the administrative privileges to do anything.

Rooting a device is simply a process to obtain access to that root user account. This requires effort because operating systems like Android and iOS run the normal user environment under a non-root user account that has privilege restrictions. To do something that the default user account can't do usually means the user must get root access to the device and use the root user account privileges to accomplish their goal.

There are perfectly good reasons why the phone doesn't run with root access by default. Root privileges allow you to do just about anything, which means you can screw up just about anything. The principle of least privilege is a fantastic security and stability adage; root privileges are best left to those who explicitly want/need them.

How Do You Get Root?

There is a program called su (also with old origins in Unix) that allows a user to open a new session under a different user account. This program allows users to effectively switch to a different user, including the root user, and perform tasks as that other user. Naturally, you must have access to the target user account to su to it.

Rooting a device is basically just the act of placing an su program onto it. Apps that want root access run this program to get a session under the root user and then they perform their root-requiring tasks with that session.

How Does "su" Grant Root Access?

A typical program runs with the permissions of the user who executed it. You can't run a program as the root user without first effectively being the root user. This creates a chicken-egg problem for getting root access because the user starts with a non-root account.

But this is hardly a new problem, and there's a simple solution. Unix-style systems have long had a file permission called the SUID (Set User ID) permission. When a file has the SUID permission set, it means "regardless of who runs this program, run it with the permissions of the owner of this file". When you install the su program, it will be installed as owned by the root user with the SUID flag set, so when a non-root app runs su later, su will run with root permissions, giving a root user session to the app that called it.
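
The SUID bit itself is easy to poke at. Here's a small Python sketch that sets the SUID permission on a scratch file and reads it back with os.stat. (Setting SUID on a file you own is allowed for any user; it only has security consequences on executables owned by root, like su.)

```python
import os
import stat
import tempfile

# Create a scratch file and set the SUID bit on it.
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o4755)  # 4xxx = SUID bit; 755 = rwxr-xr-x

st = os.stat(path)
assert st.st_mode & stat.S_ISUID  # the "s" in -rwsr-xr-x

os.remove(path)
```

On a rooted device, su carries exactly this bit, except owned by root, which is what turns "run it with the owner's permissions" into "run it as root".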

But having su hand out root access to anyone or anything that executes it would be a really bad idea because there would be no protection for the root account. So a mobile-device version of su usually includes a prompt that requires the user to authorize each attempt to use su to get root access.

How Do You Install "su"?

To install su you must place it in a findable location, have it owned by the root user, and have the SUID permission set. Installing normal apps is easy, but that doesn't work for su: normal apps can't be owned by root, can't install to key system locations where other apps can find them, and belong to a user account with limited permissions. We need a way to take an su file that is not owned by root and make it owned by root.

Now we have another chicken-egg problem: For security reasons (such as this very process) you can't change a file's owner to be another user without that other user's permission. So we can't change a file to be owned by root without root access.

The solution is to get temporary root access in order to install su and give it the proper ownership and permissions, after which it will serve as a permanent anchor for root access.

How Do You Get Temporary Root Access?

This is the hard part of the rooting process, and one of the main reasons why there are so many different rooting methods. They all involve side-stepping the normal operating system security checks in some way. Some devices make it easy, some don't. Most of the solutions fall into one of these categories:

  • Boot to a different operating system and install "su".

    This is probably the "cleanest" method. In a different operating system we can do whatever we want to the device's main operating system because it won't be running and can't enforce its permissions. Booting into a different OS usually entails unlocking the boot loader, after which you may boot to something like a custom recovery image, which is essentially a mini operating system itself. This is one reason why unlockable boot loaders are such a big deal to modders in the community: if the boot loader is unlockable, then rooting the device is straightforward.

  • Find a security exploit that gives admin access.

    Sometimes security bugs in the operating system allow exploit code to be executed in a privileged environment. For example, years ago iOS 4.1 could be exploited by a maliciously crafted PDF that would exploit the OS to gain root access, and that temporary root access could be used to install the su program for permanent access. Many security holes get patched eventually, so the community has to find new ones. (Security holes you can exploit can also be exploited by others, so it makes sense that they get fixed.) Finding security exploits isn't always easy, which is why you sometimes have to wait a while after a device is released for a rooting method to become publicly available. (Although there are rumors that some distributors include relatively easy vulnerabilities just as a nod to modders.)

  • Attach through a debugger.

    Sometimes administrative tasks can be performed through developer debugging support. Debugging support is aimed at developers and turned off by default. After it is enabled, it can often perform a variety of admin functions.

You only need to get root access once. Any subsequent task, including switching the su program for another one or updating it, can be channeled through an existing root session or a new root session from su.

What Is Necessary After "su" Is Installed?

After the su program is installed, a normal app to manage it should be installed. For example, on Android the popular ones currently are SuperUser and SuperSU. Such an app manages the su program: upgrading it, allowing the user to configure settings for it, etc, since su is usually fairly minimal by itself. Not all su programs are identical; you should make sure that the one you install matches the su-management app you install later.

You may also need to worry about preserving root access through OS upgrades, such as OTA updates. When the OS is updated sometimes directories get cleaned out and re-installed. If the su program gets removed, root access is lost. This will be addressed more below.

How Do You Detect if Root Is Available?

Any app that needs root access has to be able to find the su program. It is usually placed in a well-known, typical folder of common system programs. Apps that need root access then check these common locations to see if they can find anything named "su".

Some apps refuse to run on rooted devices. They can detect a rooted device the same way a normal app does: by finding a program named "su". Any program named "su" is obviously suspicious, but detection can go beyond that and check for any file owned by the root user with the SUID bit set. If the detection is very aggressive, it may in theory even pick apart an executable program to see if it makes any operating system calls that change the user account.
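
A naive detector along those lines might look like this Python sketch. (The candidate path list is illustrative; actual su locations vary by device and rooting method, and only the first two paths are the common Android ones mentioned later in this article.)

```python
import os
import stat

# Well-known places an su binary tends to end up (illustrative list).
SU_CANDIDATES = [
    "/system/xbin/su",
    "/system/bin/su",
    "/sbin/su",
    "/su/bin/su",
]

def looks_rooted():
    """Naive root check: is there an "su" at a well-known path that is
    a root-owned file with the SUID bit set?"""
    for path in SU_CANDIDATES:
        try:
            st = os.stat(path)
        except OSError:
            continue  # path doesn't exist or isn't readable
        if st.st_uid == 0 and st.st_mode & stat.S_ISUID:
            return True
    return False
```

Both the apps that want root and the apps that fear it are running essentially this same scan.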

How Do You Preserve Root?

Given the above two sections, some users will face a threat to their root access at some point, be it an OTA update or an app that won't run on rooted devices. Sometimes the user needs to preserve root access even if they concede to losing it temporarily.

Keeping the "unrooting" temporary is simple: open a root session, use it to back up the su program, remove the original, and then restore the backup later. This works because a root session can be opened and preserved even after the su program is gone. This can save su from being blown away by an OTA update, and it can hide it from apps that complain if the device is rooted. To evade more active detection, users can try to hide su in more creative ways, such as by removing the SUID bit and changing the file owner, which would otherwise look suspicious.

However, temporary unrooting is risky because the only hook into root access during the process is a temporary root session. If it is lost, then root is lost. This is why unrooting guides always warn against doing anything risky with your device during the process. The obvious big problem is rebooting: if you reboot, you'll lose your active root sessions, and if su wasn't properly restored you'll be permanently unrooted.

Evading root detection is an open-ended problem, and the evaders (aka, users) often have the advantage. But most anti-root apps don't seem overly aggressive about their detection; often they do only enough to be able to claim that they're avoiding rooted devices. (Think of legal media consumption apps that just want to convince the content providers that the app will check for rooted devices before downloading copyrighted content.)

Risks of Having Root

Having root access is not without risk. Any app with root access has unlimited access to the device, including all the other apps and their data. Obviously some apps store things like login credentials, but others might store even more sensitive information, like the Google Wallet app on Android, which has to store credit-card-related information locally. I'm sure Wallet has a lot of protection around the information it stores, but the bottom line is that since Wallet can unlock that information, another app with root access might be able to as well. Hence the warning the Wallet app gives when it detects that the device is rooted.

Even if you trust the apps you grant root access to and are sure they won't intentionally abuse it, what if a different app finds a vulnerability in one of them and uses it to get root permissions, then uses those permissions for evil? Realize that your security trust perimeter extends in a grey-ish way to any app you install, and fully extends to any app with root permissions, including su itself. You are trusting that they won't do stupid or insecure things with their root privileges.

Checking "su"

If you have rooted a device, you can confirm whether it uses the su method described in this article.

  1. First get a shell/terminal/command prompt on the device. (Use a terminal app, install and connect to an SSH server, or use developer debugging tools, etc.)
  2. Find the su file. Popular locations on Android devices are /system/xbin/su and /system/bin/su. If your device has the find command, you can find su using the command:
    find / -name su
  3. If su exists, check its permissions using:
    ls -l /path/to/su

    An example output:

    -rwsr-sr-x root     root        91980 2012-11-03 03:06 su

    This indicates that the file is owned by user "root" and group "root", and the two "s" flags mean that the SUID (run with the owner's user permissions) and SGID (run with the owner group's permissions) bits are set.

You can manually use su to open a root shell on your device at any time just by opening a normal command prompt and entering su as a command. (At least, I can with the ones I've tried, YMMV.) You should get the standard confirmation prompt followed by a new root shell.
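
The permission check in step 3 can also be scripted. Here's a minimal Python sketch (assuming you can run Python on the device, e.g. via a terminal app; the function names are my own, not from any standard tool) that inspects a file's special permission bits with the standard library:

```python
import os
import stat

def describe_special_bits(path):
    """Return which special permission bits (SUID/SGID) are set on `path`."""
    mode = os.stat(path).st_mode
    bits = []
    if mode & stat.S_ISUID:
        bits.append("SUID")  # executes with the file owner's user permissions
    if mode & stat.S_ISGID:
        bits.append("SGID")  # executes with the file owner group's permissions
    return bits

def is_setuid_root(path):
    """True if `path` is owned by root (uid 0) and has the SUID bit set,
    i.e. it runs as root no matter who invokes it."""
    st = os.stat(path)
    return st.st_uid == 0 and bool(st.st_mode & stat.S_ISUID)
```

For an su binary matching the example `ls -l` output above, `describe_special_bits` would report both SUID and SGID, and `is_setuid_root` would return True.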


A typical rooting session needs the user to:

  1. Temporarily obtain admin access.
  2. Install an su program that is owned by the root user and has a special "inherit root user's permissions" file-level permission set.

Just about any method that gets that done will suffice.

As a side note, the term "root" as a verb isn't new. It has long been used by hackers who gain full administrative access to a hacked machine, which is pretty much the same thing.

Hopefully that provides some context and perspective on the rooting process and how it works.

A Hacker News Parody Thread

I spent some free time over the last few days putting together a parody comment thread for the news/link aggregator Hacker News (HN). (This parody isn't officially affiliated with Hacker News whatsoever.) As with any community, HN has its quirks and predictable comments. It has a mix of young entrepreneurs, highly skilled senior engineers, web developers, managers, etc., which can make for an interesting blend of discussion.

For fun, I wrote a mock comment thread for a hypothetical link to a tech guru blog post. The parody comments attempt to humorously encapsulate the quirks that stand out to me when I read such a comment thread. Such quirks include things like differences in attention to detail, which posts garner the most replies, knowledge grand-standing, which comments end up at the bottom, and popular off-topic tangents. Some of it isn't unique to HN, but it was fun to include anyway.

Ignoring the rule of not explaining your jokes, here is some of the effort I put into the parody. (Don't read this before you read the actual page.)

  • Some of the comments are meant to be taken verbatim, others are just meta-commentary on their content. The two styles are interspersed without any markings to distinguish them. I was afraid that giving the two types different formatting would detract from the formatting parody, so hopefully it will be obvious which way they should be taken.
  • All the URLs are mini-jokes. (The "reply" URLs are meta-jokes about potential replies that didn't work as well as actual comments, usually because they're the type of thought or comment that I consider briefly before moving on. The links to HN resources are commentary about those resources.)
  • The timing of the posts isn't arbitrary, and some of the positions of comments with their time ordering are mini-jokes too.
  • Some of the usernames are commentary on the type of person who I think of when reading such a comment, some of them are just gibberish, and some are just juxtaposition jokes.
  • All of the numbers in the page source are mini-jokes themselves.
  • I kept the downvote arrows for those who crave, but have not earned, the ability to downvote comments. (Voting does nothing, obviously.)
  • I used the actual HN page markup, although I now hate myself for doing so.

I hope I covered my bases for parody work. I tweaked the main logo and site title, none of the page's resources are being pulled from the original website, and there's a big "parody" banner at the top.

It's worth noting this is not the first Hacker News parody: HN front page parody.

It was fun to make, and I hope HN readers enjoy it. Here's my account on HN and the parody's submission to HN.

[Edit, 1.5 hours later]: The response on HN was fantastic - thanks guys! The comment thread is at least as funny as the parody itself; definitely read it after reading the parody.

My First Float Tank Experience

I don't think that most people would consider lying in a dark, soundproof box for over an hour to be relaxing. But, I'm not most people.

I recently tried an isolation chamber, aka "float" tank, for the first time. A float tank allows the participant to come as close as possible to not experiencing any of their physical senses for a prolonged period of time (hence another name, "sensory deprivation chamber"). The tank allows virtually no light or sound in and holds a shallow pool of highly dense salt water that keeps a human body afloat, permitting you to float without touching, seeing, or hearing anything. The goal is that, once inside for a while, you feel disconnected from your senses while you float in nothing.

People have various motivations for using float tanks. Using one for just 40 minutes can alleviate stress, allow the body to heal injuries more efficiently, allow muscles to relax, and provide other skin and edge-case medical benefits. Some people, like me, just find the idea relaxing. Most people I talk to don't think they would enjoy the experience (and they're probably right), but I'm a very introverted person and I spend much more time inside my head than out of it. When I'm thinking I find external stimuli distracting; too much of it can be annoying or even tiring. I enjoy having quiet time with very little stimulation, and a float tank is the quietest session you can have. When I found commercial sensory deprivation chambers being marketed as flotation tanks I was instantly intrigued and bided my time until I had the chance to try one.

My Experience

I went to a local salon and spa, which offered a couple of float tanks among its services. Float tanks aren't too easy to find, but after some searching around it seems that most metropolises have at least one place that offers them. The float tank itself was essentially a large, covered bathtub in a small, dark room just a couple feet wider and longer than the tank itself. The procedure was to shower, enter the tank, and close the hatch behind yourself; an attendant would knock on the room's door once the time was up (taking further measures to wake you if necessary). I opted not to use background music (recommended) and to float in the nude (to avoid feeling any clothing, also recommended).

Closing the hatch behind me for the first time felt odd. I can't say I've ever before stepped into a small box with practically no light or sound. There was a sudden rush as I could almost feel the light and sound leaving my brain, and I was suddenly very aware of how much of both I had been processing just before closing the hatch. The tank was virtually soundproof; I couldn't hear anything: no hallway chatter, no honking cars, no footsteps, nothing.

The Physical Aspect

The water was just one foot deep, but the extreme salt density made that plenty to keep me afloat. I extended my arms and legs to touch the sides of the tank and center myself, then pulled my limbs slowly off the sides to let myself sit motionless in the middle. This was trickier than it may sound, since the slightest bit of momentum causes drift and, eventually, contact with the sides. It took me a couple tries to succeed.

At that point I experienced a truly unique feeling. I saw nothing, heard nothing, and felt almost nothing. The water was body temperature, and whenever I was motionless for an extended period the feeling of it would subside until it was minimally noticeable. But any movement, or even conscious thought about it, would bring the feeling back. It wasn't distracting by any means, but it was still a sensory connection to the outside world.

One of the keys to floating is to relax as much as possible, both mentally and physically. Relaxing physically was actually a bit trickier than I had expected. I tend to be somewhat highly strung internally and I often tense muscles without realizing it. (I believe most people do this to some extent.) It took longer and more focus than I expected to relax all my muscles. After about 5 minutes I realized I still had some facial muscles on the left side of my mouth slightly tensed; a little later I realized my right quadriceps were a little tensed; later still I realized I had re-tensed my face; etc.

After about 20 to 30 minutes of being completely motionless it felt like my muscles were almost dead. While I knew I could move any muscle I wanted to, it felt like it would require tremendous effort to do so. At one point I twitched my foot, just for fun. It felt like there was a 10 pound force working against my foot as I twitched it. I think it may have produced some muscular benefits, since I felt several brief localized muscle spasms that were possibly tight muscles relaxing.

The sensation of lying in the tank was nothing like lying in bed. For one thing, my posture, suspended in the water on my back, let my head sit farther back than it would if I were lying on a normal hard surface. Initially it was a bizarre feeling, since it felt like my head was sitting too far back and out of my control, but I got used to it. The rest of my body was held in a perfectly comfortable floating equilibrium. You can still feel a bed: the sheets feel soft, the mattress offers firm, albeit ignorable, resistance. The float tank offered no sensation or feeling. It wasn't snuggly, warm, or just kind of quiet. It felt like as close to nothing as possible. (Interestingly, tests have shown that replacing the dense water with a bed does not provide the same benefits.)

The Mental Aspect

Once I was centered and relaxed I was very comfortable and felt completely alone with my thoughts. I let my mind wander for some of the time, and I let myself focus my thinking for other times. Aside from my own heartbeat, it kind of felt like time stopped.

My brain felt so unencumbered while thinking. It was like a CPU able to run a dedicated process without interruption from I/O and other processes seeking time-share. When they were focused, my thoughts were in one of those extremely laser-like grooves that come along only occasionally. I had compiled a general list of things to think about in the tank beforehand, and without going into specifics they covered various ideas from my programming projects to philosophical quandaries. I was able to analyze and organize things very quickly, and had extra time to pursue other ideas that came up. I felt no time pressure while thinking; I moved from step to step as I felt comfortable doing so.

My biggest motivations for using the float tank were relaxation and time alone with my thoughts. I was not disappointed.

Other Things

Unfortunately, I did make a mistake. At some point early on I instinctively touched my face with my hand, probably to scratch an itch. At about the 15 minute mark I opened my eyes, just to see how much light there was in the tank now that my eyes were adjusted (answer: almost none, I could barely make out the walls 2 to 3 feet away). That allowed some very salty water to run into my eyes. I was doomed because I couldn't get it out of my eyes with my salt-water covered body. I tried ignoring it, but after my eyes burned for 20 seconds I gave in. I exited the tank, wiped my face off with a towel, got some water from the shower and cleaned my face and thoroughly flushed my eyes, and got back in. The problem took only a minute to fix, but it was still a disruption.

I ended up floating for a total of 1 hour 45 minutes. That's a long time to be without any physical stimulation, but I really enjoyed it. I came out of it very relaxed and feeling pretty good. I didn't even fall asleep once, although I had kind of expected to. I would do it again.

Advice to Potential Floaters

Based on my experience, here is what I would offer anyone planning to try a float session.

  • Spend a minute in the beginning getting yourself positioned. - You don't want to touch any of the sides of the float tank. Unfortunately, any bit of momentum causes you to drift, and if you start drifting you will probably bump into a side. Any time I made a noticeable movement I extended my arms and legs until they touched the sides, used them to center myself, then slowly withdrew them.

  • Avoid getting salt water in your eyes. - This may seem extremely obvious, but it's worth emphasizing. Don't even get salt water on your face, and avoid opening your eyes regardless.

  • Spend some time focusing on relaxing your muscles. - I think that focusing on your body too much would defeat part of the purpose of floating, but it's worth spending some time up front intentionally relaxing all your muscles. It may not be as easy as lying down and telling yourself to relax. I'd recommend spending 5 or so minutes just focusing on relaxing every muscle from your face to your toes. It's very easy to tense them unintentionally.

How do you know if you would enjoy floating? It's probably impossible to know short of actually doing it, but here's a pseudo-test to screen out some who definitely would not like it: Take a pair of the best earmuffs or headphones you can find, put them on, and lie on a bed in a dark room without a pillow for five minutes. If you feel like ending before the time is up, you would probably not enjoy it. If you find it relaxing, floating may be enjoyable for you. (I enjoy doing that sort of thing, that's why I was fairly certain I would enjoy a float tank.)

A Math Major on Khan Academy's Exercises

  • By Brad Conte, December 13, 2012
  • Post Categories: Math

I finished all of the practice exercises on Khan Academy. I have to say, it was fun.

Profile bar after finishing all exercises
(My Khan Academy profile)

I'm not exactly their target audience, though. I received a B.S. in pure math about 3 years ago, so my math background is far beyond the current content offering of Khan Academy (which is basically U.S. high school AP).

I was motivated to try the exercises for a couple reasons. First, I hadn't really checked out Khan Academy before, despite the fact that it had generated a lot of interest on the Internet over the last couple years: I saw it referenced a lot as a math review resource, and I heard undergraduate engineering students swear by it. Second, I'm a big fan of doing consistent mental exercises. I definitely exercise my brain during the work/reading/fun cycle, but I like having some sort of consistent activity to warm up my brain. I usually prefer puzzles that require focus but don't stretch your brain. I hadn't had a set of such exercises in a while and I thought that basic math review would work well for that purpose.

I did all the exercises and watched a small handful of videos. My comments are only about the exercises, not the videos.

Doing the Exercises

I worked through the exercises by following their exercise dependency tree. It's an inter-connecting tree of all the exercises and how they relate to each other. The top/root of the tree starts with an exercise on basic one-digit addition, branches out through the rest of arithmetic, continues into geometry, algebra, etc, and then eventually collapses down to Differential Calculus. There are 37 groups, each with between about 3 and 20 individual exercises. From the zoomed out perspective you can see how all of the high-level groups relate to each other and if you zoom in you can see how the exercises within each group inter-relate, with a few of the edge exercises connecting to exercises from nearby groups.

I followed the tree very methodically from the top down, working left-to-right when a tie-breaker was needed. Exercises are presented in sets of 8 questions and you earn "proficiency" at an exercise by demonstrating sufficient skill at it. You can choose to do problems from one specific exercise or a mix of questions from all the exercises in the group. I chose to do just one exercise set at a time because I enjoy getting into a focused "zone" while solving problems and I was largely doing this as a brain warm-up. I'm not sure that was best, though; if I continue doing exercises I will probably use the mixed group questions.

I worked through the exercises at varying rates. In the beginning I would usually do exercises for a couple 10-minute sessions over the day, one in the morning and another at lunch or during the mid-afternoon. After a couple weeks I enjoyed the process more, especially as I moved past the elementary exercises and into the more fun ones like probability and systems of linear equations. I had never used a math question-answer format like this before; being able to get a stream of questions in an easy-to-answer format was downright fun. I found myself putting more free time into the exercises. After all, who doesn't love to do some math problems here and there? (It was a rhetorical question. The majority of you can put your hands down now.)

As expected, the exercises weren't challenging. As a math major, I was not only well-trained in mathematical thinking but I also used the principles of geometry and algebra constantly, so there was nothing there I was not familiar with. However, a handful of exercises used tricks (like converting repeating decimals into fractions) or identities (like trig identities) that didn't come to mind and prompted me to use the "I'd like a hint" button to see a solution strategy.
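
As a concrete illustration of the repeating-decimal trick (the numbers here are my own example, not from a specific Khan Academy problem): if x = 0.272727..., then 100x - x = 27, so x = 27/99 = 3/11. The trick generalizes to any purely repeating decimal, and Python's fractions module handles the reduction:

```python
from fractions import Fraction

def repeating_decimal_to_fraction(block: int, length: int) -> Fraction:
    """Convert a pure repeating decimal 0.(block) to an exact fraction.

    E.g. block=27, length=2 represents 0.272727...:
    if x = 0.272727..., then 100x - x = 27, so x = 27/99.
    Fraction() reduces the result automatically (27/99 -> 3/11).
    """
    return Fraction(block, 10 ** length - 1)

print(repeating_decimal_to_fraction(27, 2))  # 3/11
```

The same function gives the familiar 0.333... = 3/9 = 1/3 and 0.142857142857... = 142857/999999 = 1/7.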

To make the exercises more engaging I restricted myself to not using scratch work or a calculator whenever possible. This made some of the exercises more difficult and led to various mistakes in the exercises requiring multiple steps. It was a good challenge, though, and some of the exercises forced me to keep more objects in working memory than was comfortable. A few exercises had me constantly making mistakes due to bad mental work, typos, or hasty shortcuts (such as not bothering to check whether 221/299 can be reduced by a factor of 13).
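
That 221/299 example is easy to miss mentally because the shared factor is 13 (221 = 13 × 17, 299 = 13 × 23), outside the small factors most of us check by habit. A quick sanity check with Python's standard library:

```python
from math import gcd
from fractions import Fraction

# The shared factor that is easy to overlook mentally:
print(gcd(221, 299))        # 13  (221 = 13 * 17, 299 = 13 * 23)

# Fraction() divides out the gcd automatically:
print(Fraction(221, 299))   # 17/23
```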

While I worked from start to finish fairly linearly, I didn't race through the material. I backed up and re-did exercises that I found particularly fun, and re-did some using slightly different methods. (I'd estimate only about 1/5 of my total points came from re-doing exercises, though; I believe exercises are worth fewer points once you are proficient at them.) I finished all 380 exercises in about 2 months of consistent work. Khan Academy is constantly adding content, so I plan to do future exercises as they are added.

The Educational Experience

Although I couldn't see the exercises from the point of view of a student with little mastery of the content, I had some thoughts on their educational value.

I think that the biggest benefit of the Khan Academy exercises is that they can supply an unlimited number of practice problems and provide immediate feedback and explanations. This is something that's hard to do in a non-computer setting. A teacher can only spend time on so many examples during class and homework doesn't give a student immediate feedback. The strength of automation is that examples are infinite and feedback is instant and I think they leveraged both of those aspects well.

The tree of interconnected math concepts was well done. I wish I knew of a similarly detailed tree for higher math topics. A very select few of the tree relationships did seem backwards, where I had a hard time believing that a latter exercise offered any challenge a former one didn't. (I now wish I had saved examples, but I didn't.)

The exercises did a good job of focusing on one thing at a time. When learning any new skill it usually helps to isolate it in a familiar environment and focus on the unfamiliar part, so by focusing on one idea per exercise these exercises would likely be a very helpful learning aid for someone who is trying to improve a specific weak point in their math skills. Combined with the tree as a whole, it would probably be easy to backtrack from an exercise full of confusing ideas to the lowest point of weakness and then work on refining the relevant skills from the bottom up. I can see a coach being able to do this easily for a student, or a motivated student doing it themselves.

The system uses some clever math and machine learning to estimate when students have achieved proficiency in an exercise (but all the student sees is a progress bar). I didn't know this when I began, yet it was immediately clear that solving the first several problems in a set quickly and accurately earned proficiency, while a couple of mistakes required numerous correct answers to earn it. The system was well tailored: it let an initial burst of obvious competence pass quickly while not passing mostly-right-but-still-struggling performance.

The "I'd like a hint" button was not the learning resource I hoped it would be. It offered step-by-step solutions (allowing the user to reveal one step of the solution at a time) but the steps were very formulaic without much explanation. For example, a typical step would read like "next we take the X and blah it with the Y", but with no mention of why this was necessary or possible. The exercises definitely were not teaching resources by themselves. Each problem did have a link to the associated teaching video, though, so they weren't designed to stand-alone.

Here are some of the exercises that I thought were particularly interesting:

  • The group of triangle proofs. This set allows for a unique chance to walk through geometry proofs step by step. The way they set it up, each step was verified as you input it, so it was pretty much impossible to get it wrong. The geometry diagrams would light up the relevant portions each time you input a new step or hovered over an old step, so I think it would give nice feedback for students who have a harder time "seeing" the geometrical aspects of what they're doing.

  • The exercise on derivative intuition. A simple introductory concept that was well-illustrated. I'd recommend all beginning calculus students do this exercise. There were a couple other exercises oriented at building "intuition" for a topic. I really liked this, since I'm in favor of teaching intuitive concepts. (These should be replaced by rigor later on, of course, but many students, such as myself, benefit from these sorts of "gimme a clue or view as to what's happening here" explanations.)

  • The group of logical reasoning. A quick review of the basic steps of logic and the relevant syntax. This is typically covered in the first few weeks of a first-year introduction to math college course for math or computer science students. I'd recommend everyone in such a class go through the exercises to help cement those ideas.

Unfortunately, some of the exercises lacked diversity in their problem sets. For example, when reducing fractions there were never common factors larger than 13 (possibly to keep it mentally doable), polynomials often seemed to be drawn from only a few styles, and a limited number of triangle ratios were used. Some problem sets were constrained by the intent of the problem, such as needing to produce clean answers, and those constraints left only a handful of possible starting values. Those cases were understandable, but some exercises had no such constraints. With problems that don't vary much, a student could unconsciously fall into a rote system of "take the number there, blah it against that number, blah the result and write it down". The entire point of Khan Academy is self-study, so I'm not suggesting it needs anti-cheating measures, but some problem sets seemed unnecessarily bland considering the pool of potential problems. In some of those exercises I would consistently get the exact same question twice within a span of eight questions.

There weren't many exercises on the deeper/more challenging topics. The practice exercises don't cover as much depth as the video lessons do, and I would like to see more of the topics fleshed out. For example, there are very few Linear Algebra exercises yet many Linear Algebra videos. Calculus is limited to Differential Calculus only. Even in Algebra they covered a lot of specific skills but left out some worthwhile ones. They're consistently adding new exercises, though (they've added 13 in the past three months alone), so hopefully these areas will be expanded in the future.

Miscellaneous Observations

On grading:

  • You can achieve proficiency at any point during the exercise set, not just the end. You can keep an eye on the green star in the left-hand bar to see when you get it. Once achieved it can't be lost, although it can be recommended for review.
  • I could almost always pass a problem set in four questions if I answered them correctly and within their "fast" time slot.
  • The most efficient strategy I found for passing a problem set was to get the first five questions right with three of them done at "fast" speed. Focusing on speed for all five would encourage hasty mistakes; better to let the easy ones be fast and give them all good attention.
  • Once a mistake was made, it would take at least 6, usually more, correct answers to pass the set. Wrong answers were weighted much more negatively than slow answers.
  • A mistake seems to set your progress bar for the exercise down to at most half-filled.
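
Khan Academy's actual proficiency model involves machine learning whose internals I don't know; purely as a toy illustration of the grading behavior observed above (every threshold and weight below is invented by me and only loosely mirrors the site), the progress-bar dynamics could be sketched like this:

```python
def proficiency_progress(answers):
    """Toy model of an exercise progress bar. NOT Khan Academy's algorithm.

    `answers` is a sequence of outcomes: "fast" (correct and quick),
    "slow" (correct but slow), or "wrong". Correct answers add progress
    (fast ones more), while a wrong answer both caps the bar at half and
    subtracts from it -- mirroring the observation that one mistake sets
    the bar back to at most half-filled. Proficiency is progress == 1.0.
    """
    progress = 0.0
    for outcome in answers:
        if outcome == "fast":
            progress += 0.25   # invented weight
        elif outcome == "slow":
            progress += 0.15   # invented weight
        elif outcome == "wrong":
            progress = min(progress, 0.5) - 0.25  # heavy penalty, capped bar
        progress = max(0.0, min(progress, 1.0))
    return progress

# Four fast, correct answers reach proficiency -- matching the
# "pass in four questions" observation above:
print(proficiency_progress(["fast"] * 4))  # 1.0
```

With these made-up weights, a wrong answer is far more costly than a slow one, which is the qualitative behavior I observed; the real system is surely more sophisticated.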

On answers:

  • Their scratch pad interface is a nice idea, but it's impractical to draw/write things with the mouse if you need to be fast. I just used a pen and sticky notes when I wanted scratch paper.
  • Allowable answer formats varied somewhat inconsistently. Fractions almost always had to be in reduced form (but not always), rounding varied from zero to two decimal places, and these variations appeared across exercises that had nothing to do with reducing fractions or rounding decimals. It was worth checking the little "acceptable format" indicator to see what types of input were allowed.
  • The answer parsing was reasonably flexible. For example, when entering multiples of pi you could write "5pi" or "5 pi". It was flexible enough to not be annoying.

Final Thoughts

The Khan Academy exercises are a good source of practice material. They are not, however, a learning resource and do not replace real homework assignments and feedback by teachers. But I think they're solid supplementary material. I also think they would help someone who was once comfortable with material and wants to re-learn it.

I think Khan Academy did a good job making the exercise format pain-free and enjoyable.

Thoughts on "Changing School Mathematics"

  • By Brad Conte, September 21, 2012
  • Post Categories: Math, Reading

I read an interesting essay about the "New Math" effort from several decades ago: Changing School Mathematics by Robert Davis (about 12 pages long). The essay gave an overview of New Math but focused on the Madison Project, a specific effort at implementing New Math, and how it sought to change the way students learned math from the old style of following formulas to a new style of emphasizing conceptual understanding. The essay covered some interesting techniques of teaching and how students responded, and some of the thoughts from the essay really resonated with me.

The project was set up to address some shortcomings in the popular math teaching of the time and the difficulties that math education faced. One of the biggest problems in teaching mathematics, particularly at the primary education levels, is that it is difficult both to convey a good understanding of mathematics and to tell whether the student has gained a good understanding or is simply memorizing and repeating steps. Much of the usefulness of math stems from understanding what is happening and why, and a student who doesn't understand that is missing out on the majority of what learning math has to offer. So there has been a lot of debate and experimentation about the "right" way to teach math and which styles are or are not effective.

I liked a lot of what the author had to say about teaching math, and I think a lot of it applies to learning and teaching in general. A lot of it resonated with my inner mathematician (I'm a math major) and reminded me of ways that I like to learn and ways that I think are effective in tutoring other people in math. (Teaching math is a subject I'm very interested in right now, as I think about starting to teach my children math in the next few years.)

Here are some interesting quotes from the article, although I recommend reading the whole thing. I think that a lot of it applies to most learning topics in most learning scenarios, not just math in a class room. (Emphasis in quotes is preserved.)

No child can be expected to "discover" historical accidents or what is in the teacher's mind. Only after a task is clearly understood can the creativity and inventiveness of children take over the agenda. The correct understanding of the [Madison] project's approach might better have been stated as, "If, at this moment in the lesson, what is needed is an intellectual breakthrough of some sort, please wait and let a student take the first step." If you wait, someone will.

- pg. 637

I like this point because I'm very convinced that we both understand and remember things better when we make the connections ourselves. It's certainly possible to feed people answers prematurely, either for the sake of keeping up a pace or because we're too excited ourselves and we want to share our knowledge, and in doing so we hinder the listener's ability to play with key pieces of the puzzle themselves. Providing an answer too soon can deny the listener a "light bulb" moment, which is key for understanding ideas.

When I'm explaining an idea to somebody, perhaps one that itself involves several different ideas, there will inevitably be a time where a few ideas come together and the listener doesn't instantly make sense of everything. (This arises frequently, almost whenever a relatively deep conversation has gone on for 10 or so minutes.) They're not dumb, just taking a moment to sift through things. When I feel like this has happened, I usually like to just go quiet for a moment and let the other person think. Even if they ask a question, I may just ignore it for a little bit (politely, of course, perhaps with a little "hm..." and a pregnant pause). The idea will make more sense to them and be more useful if they put it together in their own head, and I'm usually certain that nothing I can say will substitute for their own effort.

Besides demonstrating the assimilation paradigm, the preceding example shows another way in which the project provided help to students: the use of clear, unambiguous language and notations. We had been aware that David Page was using small raised symbols for positive and negative, as in +2 or -3, and carefully using the words positive and negative when that was the idea (as opposed to plus and minus in the situation where those meanings were intended and nonraised symbols were used). We had not chosen to follow his example in this usage until some seventh graders, asked to invent a sensible way to add, subtract, and multiply signed numbers, responded that

(+2) x (-3)

should be equal to 3. How come? "Because two times three is six, and then you have to subtract three." We converted immediately to Page's notation, and this difficulty disappeared.

Probably, if we were telling the students what to do, alternative interpretations such as the one above might not arise --or, at least, might not see the light of day and might not come to be noticed. But if the children themselves are building up the mathematics, if they are inventing ways to proceed, then exactly how they are thinking about the ideas becomes directly relevant. In traditional "teaching by telling," the question of which notation is used may be seen as unimportant.

- pg. 640

When a teacher is emphasizing a conceptual understanding of math, clear notation is necessary. This example is an interesting case where seemingly simple notation obscured the math. The student tried to derive a mathematical meaning from the notation, but the notation was imperfect. It was a simple example, and this sort of difficulty can be papered over by just telling the student what to do (they will eventually memorize the rules through repeated practice), but it illustrates how notation can be an artificially imposed difficulty.

I think that good notation should distinguish between what something is and what action is happening. There is probably a more formal way of expressing that thought, but that's the basic idea. In the example above, the student confused the quantity that existed (negative 3) with an action to be performed (subtracting 3). Obviously, the two ideas are closely related, but the notation "-3" was ambiguous by itself: the intent had to be inferred from context, and the interpretation mattered. (This last point is important because, particularly in higher math, ambiguous notation may be permitted when the various interpretations/definitions are equivalent, so the reader is free to choose the interpretation they like best.) Such seemingly small things can, at a minimum, create confusion, and once the student is confused their progress slows. I don't really see a need for such ambiguous notation, and in an ideal system we wouldn't have it. But since notation is mostly established by popular tradition, it's no surprise that odd quirks like this exist.
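To spell out the ambiguity, here are the two readings of the expression side by side, with Page's raised-sign notation making the intended one explicit (my own rendering of the example, not notation from the essay):

```latex
% Intended reading: the product of positive two and negative three
({}^{+}2) \times ({}^{-}3) = {}^{-}6

% The student's reading: "two times three, then subtract three"
2 \times 3 - 3 = 3
```

The raised signs force the first parse, because ${}^{-}3$ can no longer be read as the operation "subtract 3".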

The essay closed with:

By imagining that mathematics means knowing when to invert and multiply, we have come to trivialize mathematics, knowledge itself, and even the nature of human thought. Anyone who listens carefully to what children really think about the world will know otherwise.

- pg. 645

Another reiteration of the thesis: students, even young ones, can understand ideas and find motivations for those ideas on their own, given sufficient guidance. Rote memorization of how to use tools doesn't teach the same skills.

Math can be a difficult subject to teach. One reason it's hard is that a good understanding of it is largely an "internal" feeling. Mere words don't really convey what a person understands about math, much like words fail when trying to describe a piece of music. Differences in viewpoints, preferred approaches, and mental strengths between teachers and students can raise barriers when trying to take knowledge out of the teacher's head and get it into the student's. But it's not an impossible task, and there are definitely techniques that tend to produce better success rates than others. Good teaching allows the student to build mathematical understanding in their own head, instead of cramming a pre-defined structure in there. The Madison Project tried to accomplish this, and it seems they had a lot of good ideas that students of any generation would benefit from.

Reducing the Size of Device Images

While device images can be helpful to keep, they can be a pain to store. A raw image literally contains every byte of the original device, making it as big as the device itself. However, storing the unused portion of the disk is pointless. By zeroing out unused filesystem space and then running fast compression over the resulting image, the compressed image will be about the same size as the used data on the filesystem. If a significant portion of the filesystem is unused, this can save a lot of space in the final image.

Making a device image is not difficult. Unix-based systems have long had the "dd" utility. For Windows, Vista introduced a built-in utility for creating image backups, and many third-party applications like Partition Magic provide this sort of functionality.

The biggest disadvantage of a raw device image is that space unused by the filesystem is still saved. If a 20 GB filesystem only has 4 GB of data on it, all 20 GB will still be saved in the image. If many images need to be stored (such as images from multiple devices, or images of the same device from multiple points in time), this isn't very space efficient. Users with space concerns often compress the images, but the unused space often contains old data, since it was likely used at least once in the past as files were created, deleted, and moved. So, unfortunately, the unused portion of the filesystem usually compresses only a little better than the used portion. Since the contents of the unused portion of the filesystem are arbitrary, it is undesirable to pay such high overhead to store them.

(Note that filesystem-based images do not have this problem, but they are more complicated and do not capture the full disk, which is necessary to preserve the MBR, partition boundaries, etc.)

However, if the unused space of the filesystem is filled with zeros before the image is compressed, that space will compress down to almost nothing. Compressing 4 GB of data plus 16 GB of zeros is practically the same as compressing just the 4 GB of data.


Zeroing out unused filesystem space is simple. First mount the filesystem to be imaged, then create a temporary file on it and fill the file with zeros until either a) the filesystem runs out of space, or b) the filesystem's maximum file size is reached. In the latter case, continue creating and filling temporary files until the filesystem is full. Then delete all the temporary files. At that point practically all unused space has been allocated to one of the zero-filled files, and thus has had literal zeros written to it.

On a Unix/Linux system, the dd utility makes this easy. The following command:

$ dd if=/dev/zero of=/my/file/zero.tmp bs=16M

reads from the virtual device /dev/zero, which supplies an unlimited stream of zeros, and writes the zeros to an output file, automatically terminating when the file cannot grow any more. The argument bs=16M is included to speed up the operation: by default dd reads and writes in chunks of 512 bytes, and the constant switching between read and write operations is very inefficient, potentially making the process take tens of times longer.

I've written a quick, platform-independent C++ program that will create files full of zeros until the files are as large as they can grow and no more files can be created. While "dd" is certainly more convenient, this should work on Windows systems and on filesystems that don't support sufficiently large files. Execute the program with one argument pointing to a path on the partition you want to zero out; with no argument it defaults to the current working path.
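For reference, here is a minimal sketch of what such a zero-fill routine can look like (this is my illustrative reconstruction, not the actual program; the function name, the `limit` parameter for capping how much is written, and the temporary file naming are all assumptions added here):

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Fill the free space of the filesystem containing `dir` with zero-filled
// temporary files, then delete them, releasing the now-zeroed space.
// `limit` caps the total bytes written (pass -1 for "until full").
// Returns the total number of zero bytes written.
long long zero_fill(const std::string& dir, long long limit = -1) {
    const size_t kChunk = 16 * 1024 * 1024;   // large chunks, like dd's bs=16M
    std::vector<char> zeros(kChunk, 0);
    std::vector<std::string> temp_files;
    long long total = 0;

    for (int i = 0; ; i++) {
        std::string path = dir + "/zero_fill_" + std::to_string(i) + ".tmp";
        std::FILE* f = std::fopen(path.c_str(), "wb");
        if (!f)
            break;                            // can't create another file: stop
        std::setbuf(f, nullptr);              // unbuffered, so failures surface
        temp_files.push_back(path);

        long long written = 0;                // bytes accepted by this file
        for (;;) {
            size_t want = kChunk;
            if (limit >= 0 && total + (long long)want > limit)
                want = (size_t)(limit - total);
            if (want == 0)
                break;                        // hit the caller's cap
            size_t n = std::fwrite(zeros.data(), 1, want, f);
            written += n;
            total += n;
            if (n < want)
                break;                        // disk full or max file size hit
        }
        std::fclose(f);

        if (written == 0)
            break;                            // a fresh file accepted nothing:
                                              // the filesystem is truly full
        if (limit >= 0 && total >= limit)
            break;
    }

    // Delete the temporary files, freeing the (now zeroed) space.
    for (const std::string& p : temp_files)
        std::remove(p.c_str());
    return total;
}
```

Starting a new file whenever writes to the current one fail is what handles case b) above: hitting the per-file size limit looks the same as a full disk from inside one file, so the loop only stops once a brand-new file can't accept any data at all.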

Obviously, it may not be a good idea to perform this zeroing operation on a filesystem that is in active use. After the filesystem is filled, but before the temporary file(s) are deleted, there will be almost no room to write data to disk. While this window of time is small, any applications (including the operating system) that need to write to disk may be denied the ability to do so, and since it is rare for applications to be denied writes to open file handles, their behavior may be unpredictable. In the majority of my own tests I have not encountered a problem, but a couple of times the system froze or slowed down noticeably until I deleted the temporary files. Be careful; filling a live filesystem to the brim is not standard good practice.

There are at least a couple of technical issues that prevent the above "files with zeros" approach from completely zeroing out unused space on the filesystem.

  • Partially-used sectors will not have their unused portion zeroed out. These typically occur in the last sector of any file whose size isn't an integral multiple of the sector size, but they represent a negligible percentage of the total disk. From quick empirical testing, a typical Windows 7 install will probably have fewer than 500,000 files, and 500,000 sectors with an average of half a sector of slack each comes to only about 125 MB.

  • Writing to and then deleting a file does not guarantee the data reaches the disk, due to both the OS/filesystem cache and the disk's own cache. Cached data that never gets written out will be abandoned when the file is deleted, and the space on disk it was supposed to zero will be left untouched. But only a small portion of the data being written can be cached: consumer disks rarely have caches larger than 32 MB, and the OS/filesystem will likely cache at most a gigabyte or so. This has the potential to be a non-negligible amount, but since so much is being written, the caches overflow quickly and are forced to flush most of it to disk, so even an aggressive cache has a small total impact.

  • While it should be obvious, this is just a "good enough" approach. We probably don't care about an extra couple hundred MB of space in the image (and we're definitely not relying on this for security).

A fast compression scheme is probably better than a thorough one, unless time is unimportant. The majority of used space is likely binary data (executables, already-compressed media formats, etc.) that will yield very little compression no matter how hard you try. Any simple compression method, like gzip, will make efficient use of the runs of zeros without wasting too much time on the rest of the image.

Concluding Notes

How much the zeroing shrinks the compressed image will depend on the filesystem(s) involved, how full they are, how much data has been added and deleted, how long it has been since the device(s) were last wiped, and similar factors. But since this is a fairly easy procedure, it wouldn't hurt to try if saving device image backup space is helpful. In my personal experience, I've seen the size of the compressed image as much as halved. On a device where not much data is written, this may only need to be applied once or twice in the lifetime of the device to keep the majority of the unused space zeroed.

Somewhat obviously, this technique should not be used on a device that requires forensic analysis, as sectors unclaimed by the filesystem may still have contents that need to be examined.

A Letter to My Congressmen Regarding SOPA and PIPA

  • By Brad Conte, January 18, 2012
  • Post Categories: General Tech

I wrote my three congressmen today to voice my opposition to a well-known pair of bills under consideration by the United States House of Representatives and Senate (respectively): SOPA and PIPA. The bills were drafted with strong support from the multimedia industry; they bring a very heavy hand into the legal realm of copyright enforcement and are very unpopular with Internet-based companies and most Internet users in general. As I write this, many websites are in the middle of a day-long, self-imposed blackout in protest.

I strongly oppose these bills. I do sympathize with the fact that multimedia companies have legal and moral rights to exercise over their content and that these rights are violated by mass piracy, but these bills take far too drastic an action to protect those rights. I won't reiterate all the reasons why these bills are bad, even in spite of the recent, tamer modifications.

I'd like to share the letter publicly, as a wider proclamation of my position on this issue and hopefully as an aid for anyone writing their congressmen on a political issue. It's not a masterpiece, but I thought it might be helpful. I'll explain some of the reasoning and structure below. Here it is:

To the Honorable [insert congressman's full name],

I would like to add my voice of support to the millions of people who oppose [SOPA/PIPA].

I am sympathetic to the motivation of the bill. I understand that multimedia companies have proper motivation to protect their intellectual property. I support their moral and legal rights to own their content.

However, I do not believe that their efforts to protect their property should be at the expense of humanity's convenience and technological development. We the people do not exist to listen to music, we listen to music while we live our lives. Similarly, technology does not exist to play music, it exists to enable us to do what we want, some of which is to listen to music. Legislation like [SOPA/PIPA] takes a multimedia-centered viewpoint of the universe, assuming that we should hinder productivity and change the dynamics of an entire industry just to protect the convenience of one sector of that industry.

The multimedia industry has a history of resisting technological development due to their reluctance to change business models. They tried to legally combat cassette tapes, video tapes, and CDs under the pretense of fighting piracy that would hurt them. Yet those very mediums enabled them to distribute their content in better, more widely-reaching ways than before. They resist change, yet change and progress are what technology, and humanity itself, are about. Every sector of an industry faces critical changes over time, and while it can be difficult to adjust, they should not seek legal aid to pass their difficulties onto us, the common citizens. This is a capitalist society; the multimedia sector needs to adjust, not be pampered.

My request, congressman, is that your position be to protect the progress of the general public. I support the multimedia sector's desire to protect its content - once again, I sympathize with their motivation - but not at the expense of all our development. Their history and the current [SOPA/PIPA] legislation show that they are not seeking to provide us with multimedia enjoyment while we live our lives, they prefer to limit our lives to fit their existing business models.

Thank you for your time. For what it's worth, I voted for you. To protect me.

--Brad Conte

Some thoughts on this letter, in no particular order:

  • I used SOPA or PIPA where appropriate for the recipient. One bill is in one house, the other bill in the other house. Writing a blanket statement like "SOPA and PIPA", or worse yet just the popular one "SOPA", sounds more like a form letter. Who wants to get a letter mentioning a bill that they can't even vote on?
  • I wanted to paint a big-picture perspective. 1) I gave a very quick history of similar actions in the industry and their outcomes. 2) I noted that the multimedia industry is just a sub-sector of the larger industry that this bill affects. 3) I remembered international concerns; the U.S. is the country most immediately impacted, but this likely has implications for the whole world, hence I slipped in the word "humanity" a couple of times. The goal wasn't to be dramatic, but to keep reminding them of how wide-reaching the implications would be. The overall point was that this was far too invasive an action to protect one specific sector.
  • As a general rule for opinions and debates, you should always separate general intent from specific implementation, and separate analysis of the consequences from debate over what the consequences are. If you want to get your point across to someone, decide which perspective you're arguing and make it clear; too many conversations are completely wasted by jumping confusingly between perspectives without ever reaching a meeting of the minds on these simple distinctions. In my letter, I opposed this implementation of copyright enforcement and laid out its consequences. If a congressman doesn't agree with the assessments I asserted, that's the subject for a separate, much longer e-mail. I would suggest going for either lots of detail or very little, because few things are as weak as an argument that uses just a few details strewn about.
  • It's easy to read - about 370 words and about 2/3 of a printed page, so easily readable in a few minutes. There are 5 paragraphs (the closing line isn't really a paragraph), but only 3 have the bulk of the content; it looks very digestible at a quick glance. That's all you need to give a summary of a position. This makes it easily skim-able: a) while each paragraph is unique, you could omit any one and get the majority of the argument, b) you could read the first sentence of each paragraph and still get the message (that's actually a good rule in general), and c) you could read the first and last paragraphs and understand it. Given the popularity of this topic, it's likely that if the congressman is reading my letter, it is just one of hundreds on the same topic.
  • I emphasized what impact my position would have on the opponents (the multimedia industry). Actually, I somewhat trivialized their position as simply being for "convenience". It isn't a matter of preserving the multimedia companies, it's simply about them finding and adjusting to a new business model, just like they've done several times in the past. It's important to know what's at stake in a situation like this, and I wanted to contrast humanity's technological stifling against their convenient business model. (In retrospect, trivializing it as "convenience" was a pretty strong statement to make without supporting evidence, maybe it should've been a little tamer.)
  • Similar to the above, the secondary focus of the letter was on priorities. "We the people" are not, in general, benefited by this bill. I said and implied that a couple of times.
  • The tone is unemotional, yet a little grave. It's a serious subject but it won't kill my grandmother, so there's no need to sound like it would.
  • The closing line ("...I voted for you. To protect me.") might come off a bit snarky, but it conveys a valid point. Congressmen are put in place by "the people", and those people are who will lose (and lose very badly) should the legislation pass. Obviously there are times when a congressman needs to put aside the individual or short-term benefits of each person for the big picture, but I wanted to remind them that my hope is that their voice will do its best to echo ours. That I am strongly opposed to the bill should count for something. I didn't make any threats (i.e., "I won't vote for you if you don't oppose this"), I just reminded them of established fact. And, for what it's worth, I did actually vote for the congressmen I wrote to. (I shouldn't have to point that out, but it's an easy statement to lie about and I've seen it done elsewhere.)

Hopefully the letters will do some good. (Update: For what it's worth, I only received back form replies from a couple of the congressmen.)