Work and Workings of a Nerd

A personal blog about what's on Kevin's mind.

Archive for the ‘technology’ Category

How to Be a Secure Computer User (Within the Bounds of Convenience)

Tuesday, February 21st, 2012

It’s an old paradox to wonder why there are any doctors who smoke. It’s somewhat hypocritical, but mostly just confusing that the people who know the most about the effects of smoking should partake in it as well. I would like to say that I know better, but until a few days ago, I didn’t. Until then, my passwords for computer accounts and online services were all stored in plain text in a Gmail draft. Were you to know that you could find it there, it would be very easy to steal all of them.

Using computers securely is a big battle for experts and users, and the best practices really vary from person to person. Every person needs to decide what their tradeoff between security and convenience is. On the completely secure side, one can memorize long, random, unique strings of characters for every account. This, however, is extremely inconvenient. On the completely convenient side, one can use “password” for every password. This, however, is extremely easy for hackers to break. Everyone’s practices lie somewhere in between, using known methods and a little bit of personal construction.
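To make “long, random, unique” concrete, here is a minimal Python sketch of what such passwords look like when a computer generates them. The helper name `random_password`, the alphabet, and the default length of 20 are my own assumptions for illustration, not a recommendation from any standard.

```python
import secrets
import string

# Assumed alphabet for illustration: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length=20):
    """Return `length` characters drawn independently from ALPHABET."""
    # secrets uses the operating system's cryptographic random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

# One independent password per account, so leaking one says nothing about the rest.
passwords = {service: random_password() for service in ("email", "bank", "forum")}
```

Nobody is going to memorize the output of this, which is exactly the inconvenience described above.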

That personal construction, however, is where we really get ourselves into trouble. Security is very hard, and most ways of compromising on it aren’t good. For example, I know that I should have long, unique, hard-to-guess passwords for everything. Because I couldn’t remember them, I decided it would be okay to record them in an easily accessible place for me. Unfortunately, that happened to be a really awful practice.

Most people aren’t security experts and don’t know the best way to use computers securely. Given that, I think that a lot of bad practices can be rooted out with a quick two-question quiz:

  1. If a hacker knew where you store your passwords, could they guess your passwords easily?
  2. If a hacker knew a subset (none, one, a few) of your passwords, could they guess any of your other passwords easily?

The first question mostly deals with people securing their passwords by obscurity. This is mostly how my old system worked, and it’s a bad idea. The United States National Institute of Standards and Technology (NIST) thinks it’s a bad idea. It’s unlikely that you’ll come up with a method on your own that no one else has thought of, and if you’re borrowing someone else’s system, you’re already screwed. And even if you hide things, they can be found easily.

The second question deals with bad security design. In a world where you create passwords for many different services, some of which are well-built and some of which aren’t, it’s not hard to imagine that one of your passwords might be leaked to a hacker. That itself is somewhat unavoidable. If knowing that password, however, allows them to determine the rest of your passwords (either as a straight copy or by design), you’re in trouble.

Let me walk through a few common methods I’ve heard from friends recently (and a few trivial examples) and point out the possible problems with each of them:

  1. Use “password” for all passwords. This password is very common and not hard to guess. By question 2, a hacker could guess your password knowing a subset of size 0 of all of your passwords. This is an awful idea, as you can imagine.
  2. Use a small set (maybe 3) of passwords for everything. This is also a problem for question 2. If a hacker gets 1 of your passwords, they can now guess 1/3 of your accounts fairly easily.
  3. Store passwords in a (unencrypted) document on your computer. Even if you name it something other than “passwords.txt”, it’s not hard for a hacker who gets read access to your computer to find the file and copy it. At that point, this method fails on question 1.
  4. Start with a base password and modify it slightly for each service. For example, Google might be “abc123GoogleRocks” and Facebook might be “abc123FacebookSucks”. This ensures that each password is unique and somewhat long. Unfortunately, this is still a problem for question 2 because other passwords are deducible from a single password. In the example above, even though “Rocks” and “Sucks” are different suffixes that you can remember, it’s still a systematic method that doesn’t ultimately leave that many possibilities.

As you might have guessed, some of the above are better than others, but assuming you’re okay with the level of security each will give you, they’re all fine if they work for you. You should just be aware of the risks associated with them.

Given all this talk, I need to support my claims, so my new method is using a password manager (specifically, 1Password). Essentially, I have a single master password for an encrypted database that stores all of my other passwords. With that, I only need to type in my master password, copy the specific account password into the box (or use the auto-fill feature), then lock my manager again. Like other methods that aren’t memorized, random, long, unique strings, it’s not perfectly secure, but it’s good.

It isn’t susceptible to the flaw of question 1. Because the database is encrypted (using tested methods), it is presumed to be secure*, so even though I have announced that I’m using 1Password and the file is not hidden on my computer, a hacker shouldn’t be able to get my passwords without knowing my master password. Question 2 is a problem: if a hacker knows my master password, my world is open to them. Otherwise, my passwords can be arbitrarily complex.
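For the curious, here is a rough Python sketch of why everything hinges on the master password: the key that unlocks an encrypted database is typically derived from it. This is not 1Password’s actual scheme, which I haven’t inspected. The function name `derive_key`, the iteration count, and the example passphrases are all assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, chosen only for this example

def derive_key(master_password, salt):
    """Stretch a master password into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", master_password.encode("utf-8"), salt, ITERATIONS
    )

salt = os.urandom(16)  # stored next to the database; it does not need to be secret
key = derive_key("correct horse battery staple", salt)

# The same password and salt reproduce the key; a wrong guess yields a different one.
same = hmac.compare_digest(key, derive_key("correct horse battery staple", salt))
wrong = hmac.compare_digest(key, derive_key("password", salt))
print(same, wrong)
```

The iteration count makes each guess slow, so the file can sit in plain sight as long as the master password itself is hard to guess.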

The caveat to all this is that I trust that the creators of the password manager are honest people using sound security practices. If they’re sending all of my passwords out to their secret server, or if they screwed up the implementation of some security protocol, I’m in trouble. But that’s the line I draw for myself between convenience and security: I believe that my master password cannot be guessed and that 1Password is honest and secure, and this is the furthest I’m willing to go to be secure.

So maybe a password manager is a solution for you, or maybe it isn’t. I just wanted to write about it since I had thought about it so much recently and think it’s worth it for everyone to evaluate the ways that they are being secure. Again, it’s a tradeoff between convenience and security. Just be aware and comfortable with the consequences of your method, keeping in mind these 2 questions:

  1. If a hacker knew where you store your passwords, could they guess your passwords easily?
  2. If a hacker knew a subset (none, one, a few) of your passwords, could they guess any of your other passwords easily?

* I say presumed because even security experts don’t know if any construction is completely secure, but it’s the best that anyone knows. The limit of that is whether P=NP, for the CS literate among you, so it’s pretty certain.

If We’re Moving Classes Online, Where Are They Leaving From?

Sunday, January 29th, 2012

There’s been a recent trend of opening up coursework to the world on the internet. The first steps towards this happened a few years ago with resources such as iTunes U and OCW, where recorded lectures and notes were made available online. This trend, however, has taken the next step as university courses in their entirety are being offered online, with graded homework and a certificate to boot. I’m most familiar with offerings from Stanford, but I think others are jumping on the bandwagon, too.

The reasoning behind it is sound. A lot of coursework is moving online anyways as a convenience for students, and it seems like the right thing to do. I’m fortunate enough to be a student at a well-funded private university, but that luxury isn’t available to most people. With the popularity of this model from sites such as Khan Academy and the pipeline to do it, it seems almost unfair to restrict the content to the few who can afford it.

It’s not perfect. For many classes, there’s no replacement for the ability to work hands-on and in-person for a class. You lose the physical environment of a college campus and the ability to collaborate with instructors and other students directly. Some argue that the most valuable part of college is not the class but the people you meet. If you can’t have that, though, this is pretty close. The traditional teaching model of hour-long lectures and problem sets doesn’t really involve interaction to a large degree.

At the risk of sounding privileged and snooty, however, I’m a little disappointed by how this change is being embraced in class design for in-person students. Since this was an initiative in the Stanford Computer Science department, I’ve taken several classes (and am currently taking a class) that are now in the online format. To accommodate the students, they’ve made some changes to how the class is taught, both for online and in-person students. And I think they’re a little worse for it.

Most professors really do just lecture, and a recording is no different. I actually depend heavily nowadays on watching lectures online to work with my schedule. But some professors are really good in class, and it’s a shame to lose that. For example, Dan Jurafsky is teaching an online class on Natural Language Processing. I took the class from him junior year, and the lectures were great. He actively pushed students to think in-class, ask and answer questions, and just made it a really fun environment, even though he did teach by lecturing from slides. Even though the class was early in the morning, everyone was awake and could really be a part of things.

That class, however, might be an exception. The other substantial change that I’ve noticed is a change in the workload for students. In order to make classes available to possibly thousands of students, grading must be automated: it isn’t practical to have TAs go through all the assignments by hand. This model isn’t really scalable for, say, English classes, where most of the work is discussion and essays. But for many technical fields, where work boils down to getting the right number at the end of a derivation or writing a program that computes the correct output for some specification, it might work.

Having gone through these classes, however, I think that might be cutting students short to some degree. Let’s take Computer Science as an example. Past the 3-course introduction to programming series, most of my work has been tailored towards proofs and conceptual understanding. Once you have a sense for how to program, actually programming in classes becomes much less relevant: if you need to implement a program, you take a few days and learn the specifics of the language. The real trickiness comes in understanding how to build systems, which requires conceptual understanding to compose and extend systems.

Take, for example, Probabilistic Graphical Models. Roughly, this class on artificial intelligence is about modeling phenomena using probability. This class is historically known to destroy students. In the past, there were biweekly problem sets involving derivations and proofs for 5-ish problems, often the results of research papers in the field. TAs were responsible for grading these lengthy, often page-long proofs involving a mix of mathematical derivations, cleverness, and intuitive explanation. This syllabus, however, needed to be switched up, so it’s now weekly programming application projects.

So admittedly most of my response to that is bitterness at having been brutalized by the problem sets for this class while current students can just write a little bit of code. But it seems to me that students now aren’t getting the same depth that they would have thinking through problems. For many advanced algorithms, the actual implementation is relatively easy when it’s already outlined: it’s basically just translating a description into code, which often doesn’t require much insight into the algorithm itself. Trying to re-derive the same result or prove a similar concept, however, is much more difficult and requires understanding of how things work. I don’t need a class to teach me whether I did something right: I need a class to teach me whether I understand.

We’ll see how this shift goes moving forward. I’m all for providing content online as long as it doesn’t displace valuable aspects of current teaching methods. By focusing on the most tangible products of coursework, such as lecture content and quizzes, we might lose out on more subtle parts of an embodied learning experience. Let’s democratize education as best we can, but let’s not sacrifice the vitality of colleges while we’re at it.

Using the Web to Make Academic Work Useful

Thursday, January 26th, 2012

For several stretches of several academic school years, I have allowed my class-related work to become my blog content. Sometimes it’s more natural, such as the final essay for Creative Nonfiction, and sometimes it’s less natural, such as short critiques for Moral Philosophy. Most of my motivation for posting this work is pure laziness: it’s really hard to will myself to write a blog post after having worked through an essay. A smaller point, which is the crux of this post, is that it seems a shame that I should spend so much time on classwork that will ultimately be seen by only one grader.

Not all classwork is valuable beyond its own context. Mechanical math problems and proofs of known results are obvious examples of classwork for its own sake, so I hope you won’t be offended if I avoid posting addition and multiplication worksheets. A lot of other classwork, however, emphasizes critical thinking, synthesis, and creativity in research and projects.

I’ve tried to make most of my original and less embarrassing writing available on this site, either in blog posts or on my Writing page. Despite its pedagogical purpose, classwork can still be original and contribute to knowledge as a whole, especially given how sparse some of the content may be. For example, Google Analytics tells me that my essays and responses for Moral Philosophy are some of the most popular content from google searches on specific philosophers and philosophies. Posting this content is cheap and easy for me, and it may be extremely valuable to anyone else who ends up researching similar topics to those papers.

Even so, most of my work has been relatively simple, and I have often been frustrated by how difficult it can be to find similar resources for some actual published papers. Many researchers have released open frameworks for their work, but more often than not, the details aren’t available. Datasets, stimuli, program code, and all sorts of other work are pored over by researchers for months and sometimes years, yet are basically forgotten after being summarized and presented in a paper. I’m not a full-fledged researcher and don’t understand most of the logistics, red tape, and politics that probably drive most of the reasoning behind the process, but in the pursuit of knowledge, it only seems right to make as much known as possible.

Along those lines, I’ve taken that first step and released several of my projects on GitHub, where you can view much of the code for research I’ve done, along with some results and write-ups. As you might expect, the code is something of a mess, though should anyone want to use or understand it, I would be happy to clean it up. In all likelihood, the repositories will sit on the web, unseen and unimportant, but for how much I complain about not being able to find things, I can at least say, “I tried.”

For many of my peers who have also worked on various projects, I recommend that you do the same. I’ve seen some really impressive work come out of class projects, and it would be a shame for that to be the end of it. And use it for current projects as well. Should you be doing any coding or research, you should be using a version control system anyways, so you might as well make it publicly available as well. In academia, we’re always all collaborating with everyone.

Why We Don’t Need to Worry About Robots’ Rights

Thursday, November 3rd, 2011

Last Thursday, I went to a panel discussion being held at the Stanford Law School by The Center for Internet and Society on “Legal Perspectives in Artificial Intelligence.” My mind is mostly buried in the AI, but since I have recently become more interested in policy in general and the social impact of technology, I thought it could be interesting to see where the crossover is.

There were a lot of possible intersections, such as the use of AI to help lawyers put together cases or IP rights to AI code and programs. The topic they mostly discussed, however, was the possibility of AI being considered a legal person and what the implications of that would be. It was an unfortunate angle to take because AI equal to a human doesn’t exist, so it was mostly non-answers, roughly of the form, “Interesting; we’ll see what happens.” They also chose not to jump into the philosophical aspects too much, with only minor discussion of philosophical zombies (a being that behaves exactly like a human but has no consciousness), and instead left those as largely open questions as well.

Disregarding how unsatisfying those answers were, I was also disappointed by the conversation as a whole because I found their conception of AI somewhat narrow, and that limited the topics they could consider. Instead of considering the state of technology as it is today and the issues surrounding that, they mostly clung to the more fantastic view of AI. This view, perpetuated largely by popular media, is best represented by robots like C-3PO that are human in all except form. More generally, this view treats AI as a system with intentions, self-motivation, and more psychological properties similar to humans. And that AI doesn’t exist.

Stepping back from that, however, we already have some forms of AI, and I will make the stronger claim that what we have now will be the form of AI for the foreseeable future (with respect to legal rights; of course we’re making great progress in the nitty-gritty). So, I think that this panel was appropriate for us now, but for different reasons than what the organizers likely considered. AI is here now and it has plenty of problems surrounding it. For better or worse, though, I think it’s largely invisible in our lives. Let me give a few examples of AI in our lives, what its role is, why I don’t see it changing into HAL, and what the legal implications of it are.

First, AI in the market. The panelists discussed the legal status of AI as a trustee, an advisor to a trustee, or as a business operator. I don’t see this coming soon because AIs don’t have their own desires, so it doesn’t make sense for them to be in charge. AI can be a tool to make recommendations and crunch the numbers, but the last mile will be all blood and guts. And this form of AI already exists. Take algorithmic trading: a computer is executing trades for a fund or some other trader based only on the numbers and often faster than humans are capable of. On the whole, it’s a black box. Very smart physicists and computer scientists can build models to make it run, but once it’s going, it’s past our ability to actively monitor it. Just last year, the Dow Jones crashed, which was largely blamed on algorithmic trading. The SEC ended up changing some rules based on this, so this is a problem being dealt with right now. I haven’t followed the situation, but I imagine that there are questions about liability when AI wreaks havoc on the market.

Second, health care. This came up in the discussion of Watson, the Jeopardy-playing AI that IBM claims it wants to retool for health care. They were concerned about the possible issues here, as health is at least as touchy a topic as the market. I’m not scared about it, though, because we still have doctors. Doctors may receive advice from computers, but the final decision is going to be in the hands of a human. We don’t send our brightest to school for a decade just to let them defer judgement: they’ll still sit between a patient and AI. Even so, this is again already happening today. In fact, we apparently even have a journal dedicated to this topic. As it is, we can use probabilistic models to diagnose various illnesses by telling a computer what the various symptoms are, and it’ll spit back the likelihood of various possibilities. AI researchers will tell you that they’re actually better than doctors since they have the accumulated knowledge of many more cases than any one doctor could ever know. Importantly, AI here is just a tool, not a legal person. We do have questions today, such as patient privacy when the data are being aggregated into a single machine, and these will be the questions moving forward.
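To give a flavor of what “spit back the likelihood” means, here is a toy Python sketch of Bayes’ rule for a single symptom. The function name `posterior` and every number in it are made up for illustration, and it assumes the listed illnesses are the only possibilities and that a patient has exactly one of them. Real diagnostic models are far richer than this.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(illness | symptom) for each illness in `priors`."""
    # Joint probability of each illness together with the observed symptom.
    joint = {illness: priors[illness] * likelihoods[illness] for illness in priors}
    total = sum(joint.values())  # overall probability of seeing the symptom
    return {illness: p / total for illness, p in joint.items()}

# Made-up numbers: how common each illness is, and how often it causes a fever.
priors = {"cold": 0.90, "flu": 0.10}
fever_given = {"cold": 0.10, "flu": 0.90}

result = posterior(priors, fever_given)
print(result)
```

With these invented numbers, observing a fever moves the flu from a 10% guess to roughly an even split with the cold. The computer only reports the odds, which fits the point that the final call stays with the doctor.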

To wrap this up, let’s bring this around to an example of AI that you must be familiar with to be reading this: web search. On the surface, it seems like this is a task that humans are performing, but any non-trivial search engine you’ll encounter has all sorts of interesting AI in it, such as trying to figure out if you meant the scooters, the mice, the hygiene product, or the phone when you type in, “Why isn’t my razor working?” The net result is that people usually click on the first link, which means that we’ve already deferred a lot of our choices to AI in picking the “best match” to our search terms. But that’s a far cry from R2D2, and hopefully, no one will ever sue a search engine for giving them bad results.

And it’s everywhere else, too. Google Translate, autonomous cars, Bing flight search, Amazon search recommendations, and Siri are all examples of what AI really looks like today, and frankly, it’s not that scary. None of it may sound that impressive or very AI-like, but that tends to be a funny problem with AI as a field of research: once we figure out how to do it, it’s not AI anymore.

I think it’s important that all of those things I just rattled off are tools, not independent agents. We build things that we want, and for the most part, we want things done for us while leaving us in control. This means that we build wonderful systems that use AI to make our lives easier, but that last mile is still human.

Given that this is what AI is and what it will be (so I claim), then the issues are already in front of us now. And if they don’t seem like issues, it’s because they aren’t. Do we worry about incorrect diagnoses from AI? A doctor may blame a computer, but it’s still the doctor’s call. What about an autonomous car getting into an accident? Assuming it’s entirely autonomous, it’s no different than trams that have a preset schedule. Cars aren’t going to have their own desires (such as to tailgate the jerk who cut them off), and since we’ll understand how they work, the mystery is gone.

So in summary, AI is here now, and it’s as it will be. There are legal issues to consider with respect to AI, but we shouldn’t be worrying about AI as a legal person. And appreciate and understand how important AI already is in your lives. As tools.

How My Desktop Follows Me

Sunday, October 16th, 2011

In my last post, I mentioned that I have been transitioning out of a single machine mindset into having everything at hand with terminals where I go. I have largely stuck with it for the last month and a half, and it works very well. My backpack is oddly out of balance now that I’m mostly carrying around little items like my iPod Touch without the monster.

In any case, I promised a follow-up post where I discuss how I’m managing to do it. Here’s the list of computing tasks and needs that I have and the services that have me covered:

  1. stickies, random text documents, and other notes: Evernote. Until recently, I was very dependent on the stickies on my computer. It had my todo list, various details of interest, important addresses, and anything else I needed off-hand. I couldn’t survive without it. As evidence, when my motherboard got fried and I didn’t have a computer for a few days, the most important thing I needed to get off my external backup was my stickies: all other documents, music, code, etc were secondary. Beyond that, my “Documents” folder on my computer was also largely random notes and lists. All of these transferred cleanly into notes in Evernote, which syncs these to the cloud and offers desktop, mobile, and web integration. Now, without my stickies, I can’t live without it.
  2. music, podcasts: Pandora, Spotify, iTunes syncing. There are too many good services online nowadays to stream music, and since their libraries are mostly bigger and better than mine, it works. Podcasts are still dependent on having my one iTunes account on my machine to keep track of which ones I have listened to, but since I can sync it to my iPod, they’re with me everywhere. Recently, I put a pair of speakers in my kitchen and have been listening to podcasts while doing dishes and cooking. So, this is even more portable than the setup I was fixed in before.
  3. movies/media: Netflix. I never did have many movies or TV shows, so Netflix streaming is another improvement. I like to think it’s not completely down the drain paying for the service, since it is saving me from buying another season of 30 Rock on DVD every year.
  4. documents: Dropbox. But to be honest, I don’t really use documents anymore and haven’t really used Dropbox. Weird.
  5. bookmarks: Chrome syncing. Without this, I could have never made the switch, but between the computers I use regularly, I’m largely in exactly the same state since I spend so much time in the browser. This, along with Google+ (as discussed here), stopped my Delicious usage almost entirely. Instead, I can keep a “To Read” folder on my bookmarks toolbar and leave interesting but long sites in there.
  6. calendar: iCal. Someday, I might switch to Google Calendar, but since it syncs up to my iPod, this has worked out fine.
  7. email: Gmail. I haven’t used a desktop mail client since getting my Gmail account 7 years ago.
  8. video games: no solution needed. My games are only on my home machine, but that’s fine. I don’t need them anywhere else anyways. This also happens to be the only thing that I do that requires my computer to have any juice whatsoever. Since I usually manage to keep the number of open browser tabs relatively low, everything else runs fine and could on much less hardware.
  9. software development: ssh. I don’t do any development on my local machine: everything I do is on servers that I log into.
  10. homework: none. This largely sits on my home machine since this is where I do my homework, though I could just as easily copy it into my Dropbox folder. The only downside would be that I wouldn’t have it appear on my desktop, so this is probably another relic.

Even for all these changes, however, I actually have been moving my computer pretty regularly. In my school year housing, my room and the common room are separate, so I end up carrying my computer downstairs to plug it into the TV whenever I want to stream any movies or sports games.

I don’t necessarily have any problems or need my life more fragmented, but let me know if you have any suggestions for other services that might be useful. I’m definitely interested to hear how others have transitioned with more cloud services or have compelling reasons for not changing.

My Desktop Everywhere

Wednesday, September 7th, 2011

Back in my high school days, I wondered why computing wasn’t more portable than it was. Although laptops were popular, they still weren’t that convenient. I myself had a 64MB flash drive that I carried in my backpack that I used to sneak prohibited files onto school computers. It got most of the work done, but it wouldn’t be too much worse to carry around an entire hard drive with me if it could maintain the total state of a computer between home and school. This clearly seemed like the dream setup to have: carry around, say, a 1 lb device that could be plugged into any terminal and give you the exact same setup on any hardware.

In between then and now, I found a pretty good substitute in my MacBook Pro. Coming in at 5.4 pounds, I could carry my life around with me between my dorm room, home, class, library, and wherever else I needed it. It was pretty ideal, and for a while, I believed that the best way to live was with a single machine. No hassle with trying to sync multiple devices or reconfigure anything: just pick it up and go. My dad insisted that I have a laptop for when I went off to college, and I still agree that this was the easiest way to do things.

Having finished my 4 undergraduate years, though, I’m thinking that this is no longer such a good setup. First, it actually isn’t as wonderfully portable as it could be. When I travel, I have to switch out to use my laptop backpack, and it and peripherals dominate what I can carry. Second, it isn’t very comfortable to use. My current setup has it linked up to an external keyboard and mouse with the actual laptop perched on a board game so that the screen is closer to a comfortable viewing angle. Even better would be a larger external display, but at that point, I’m not using any of the built-in input or output functionality of the laptop.

Third, I don’t actually need to be that portable nowadays. Most of my work is done at desks, and I don’t need a laptop to work at a desk. I don’t go home that often, and the only other time I can think of that I really need a laptop is when I’m working with someone else on, say, a group project. Continuing on that note, my fourth reason is that I probably shouldn’t be carrying my laptop around with me anyways. For the first 2 years of college, I almost never moved my computer from my desk. I don’t think it’s unreasonable to say it was less than once a month. It was fine because I wasn’t distracted while I was in class. Beginning my junior year, I started lugging it around, thinking that I would be able to do some work at various times. Inevitably, though, I would find another good distraction and never get around to what I meant to do. And fifth and finally, the specs are very good for the price.

Even though the laptop doesn’t seem to fit my needs as well anymore, it’s okay because my original dream has been accomplished, though perhaps in a form I wasn’t expecting: the cloud. It’s a buzzword, but simply, that complete setup on any hardware exists and is even lighter than I thought it would be. Now, I can sit down at a computer, pull up a few online services, and have access to all of my music, documents, applications, files, and everything else I typically use on a computer. And thanks to Google Chrome syncing, my effective “desktop” is the same everywhere as my bookmarks, extensions, and other browser configuration follow me around.

I’ll get to describing the suite of services I use to do that in another post, but the conclusion that I’ve drawn from this is that I don’t need to carry around a physical object with me to make my computer use portable: I just need the internet, and other than airlines and fancy hotels (the cheap ones always offer free wifi; go figure), wifi is pretty much everywhere I would sit down to use my computer.

Still, there are situations where I would like to carry around a device with me, such as plane rides or trips home, and I’m thinking about getting myself a tablet (specifically, an iPad) to cover my bases there. In many ways, it doesn’t have good features that a PC would have, but I don’t think it needs them. My conception is that using a tablet is a fundamentally different method of computing, and I think I’m willing to take the plunge and see if it works when I get one.

In the meantime, I have stopped carrying my computer back and forth to work, instead leaving it at home as my “desktop” here and using another box at work. I couldn’t quite get away from the convenience of UNIX for development and am learning how to use Ubuntu on it, but in spite of a different operating system, I’m using it in almost the exact same way as my MacBook Pro sitting at home. As for portability, I’m carrying my iPod Touch around with me at all times now after basically leaving it untouched for the past 3 years. It’s limited, but it’s unnoticeable to carry and comes in handy in a few spots.

My transformation isn’t complete, but I’m excited to see how I adapt to this new setup over the next few months while I wait for another round of better hardware to come out. It may not be instantiated exactly as I imagined, but it looks like technology has snaked past the road bump of laptops and passed my dream of portability.

Fresh Server Install

Sunday, July 3rd, 2011

Due to some very poor decisions earlier, I managed to completely screw up the OS on this server, resulting in some downtime. Since then, I have decided to use a clean install, especially since my server was horribly configured anyways and probably needed a clean slate.

I tried hard to get everything put back together, but I probably missed something. Email/contact me if you notice any strangeness or problems with services from this server, and I’ll try to address them as soon as possible.

Installing Django on Snow Leopard (and MySQL and PIL)

Friday, January 29th, 2010

UPDATE 8/4/11: The instructions here are a little dated and difficult, probably from my own ignorance. With Lion out, too, things have changed enough that you should probably use the instructions here instead.

I recently needed to port a Django app off of our Linux server onto my MacBook Pro, which was slightly more involved than I thought it would be. I don’t remember having nearly so much difficulty with it when I initially set it up on my old Leopard partition, but times change. In any case, I figured it would be a common enough task that I would offer instructions on how to make it work. If you’re looking to just install Django fresh, you’ll notice there are a few steps specific to importing parts from another Django app, but it’ll mostly apply. So let’s go.

Here’s exactly what we’ll be putting together:

  • MySQL
  • Django
  • MySQLdb – a python package to communicate with MySQL
  • libjpeg – necessary for PIL
  • Python Imaging Library – a pretty useful library for drawing and more

If you’re slicker in Terminal than I am, there are a lot of steps here that you can do from there. If that’s you, then you can translate those operations yourself. If you’re like me and still like GUIs for some tasks, maybe this will work better for you. I’ll also leave a trail of links to the references I used to figure it out myself so you can read the real reference.

1) Installing and setting up MySQL

Download MySQL. Particularly, I ended up getting the 32-bit version for reasons that are not clear to me, though the 64-bit version might have been the correct call. I at least know it works with the 32-bit version. There’s a .dmg version that apparently fast-tracks the installation, but I ended up not using that either because it didn’t download properly. So here’s what you do to do it manually from the tarball.

Download the tarball and unzip it by just double-clicking on it in Finder. From Terminal, run the following:

cd /usr/local
sudo mv (copy path) ./
sudo ln -s full-path-to-mysql-VERSION-OS mysql
cd mysql

Here’s a nice trick I learned recently. You’ll notice the code above has (copy path) in it for the path to the mysql folder you just unzipped. Instead of typing it in, you can click and drag the folder from Finder to Terminal, and it’ll fill in the whole path for you.

So if you want to do things properly, you’ll create a separate mysql user, apparently, and let everything run through that. In my case, I’m not planning on ever using MySQL locally for actual production, so I lazily just have everything running from my user account. To complete setup

scripts/mysql_install_db
bin/mysqld_safe &
bin/mysqladmin -u root password NEWPASSWORD

Where you can fill in the root password. Next, for convenience, let’s add the mysql commands to our path so we don’t have to type them every time. Open up your bash profile and add this line (if it doesn’t already exist, just create it at ~/.bash_profile)

export PATH=./:/usr/local/mysql/bin:$PATH

Of course, if you already have a path there, you only need to add the bit about mysql to it. For convenience in the rest of this setup, copy the same line into Terminal and run it to update your PATH variable for the current session.
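If you would rather not eyeball whether the MySQL directory is already on your path, here is a minimal Python sketch of the same idea. The helper name prepend_to_path is made up for illustration and is not part of MySQL or OS X; the script only prints the line you could put in ~/.bash_profile and changes nothing itself.

```python
import os


def prepend_to_path(path_value, new_dir):
    """Return path_value with new_dir first, unless it is already listed."""
    # Empty entries (from a leading or doubled separator) are dropped.
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    if new_dir in entries:
        # Already present: leave the existing order alone.
        return os.pathsep.join(entries)
    return os.pathsep.join([new_dir] + entries)


if __name__ == "__main__":
    mysql_bin = "/usr/local/mysql/bin"  # the location used in this post
    new_path = prepend_to_path(os.environ.get("PATH", ""), mysql_bin)
    print("export PATH=" + new_path)
```

Running it twice gives the same result, so the MySQL directory never ends up on the path more than once.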

Right now, the mysql database is running because of “bin/mysqld_safe &” above. In general, from now on, the command to start it is

mysqld_safe &

and the command to shut it down is

mysqladmin -u root -p shutdown

where you’ll need to type in the root password you defined above. Go ahead and start the MySQL database again. The next little bit is for porting the contents of a web server database to your local machine.

mysql -u root -p
create database DBNAME;
exit;
ssh username@server.com
mysqldump -u root -p DBNAME > ~/DBNAME.out
exit
scp username@server.com:~/DBNAME.out LOCALPATH
mysql -u root -p DBNAME < LOCALPATH/DBNAME.out
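The placeholders in those commands are easy to mistype, and the dump file name has to be the same in the dump, copy, and import steps. As a rough sketch, this Python snippet fills in the placeholders and prints the commands to type; it does not run anything. The function name port_steps and the example values are assumptions for illustration only.

```python
def port_steps(db_name, remote, local_dir):
    """Return the commands from this section with the placeholders filled in."""
    dump_file = db_name + ".out"  # one name, reused for dump, copy, and import
    template = [
        "mysql -u root -p",
        "create database {db};",
        "exit;",
        "ssh {remote}",
        "mysqldump -u root -p {db} > ~/{dump}",
        "exit",
        "scp {remote}:~/{dump} {local}",
        "mysql -u root -p {db} < {local}/{dump}",
    ]
    return [
        line.format(db=db_name, remote=remote, dump=dump_file, local=local_dir)
        for line in template
    ]


if __name__ == "__main__":
    # "blog", "username@server.com", and "/tmp" are example values only.
    for step in port_steps("blog", "username@server.com", "/tmp"):
        print(step)
```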

Reference:

  • http://dev.mysql.com/doc/refman/5.1/en/mac-os-x-installation.html
  • http://dev.mysql.com/doc/refman/5.1/en/installing-binary.html
  • http://www.cyberciti.biz/tips/howto-copy-mysql-database-remote-server.html

2) Installing Django

This part is thankfully very easy. Download Django from here and unzip it. I myself ended up using the 1.0 to avoid compatibility issues with our web server, but I’m guessing the latest version is the greatest. cd to the folder you just extracted and execute

sudo python setup.py install

Boom, done. Thanks Django!

3) MySQLdb

Okay, so this is going to let us tie the 2 pieces we’ve put down together. Download MySQLdb from here, unzip, and navigate to that folder. In Terminal, execute the following:

ARCHFLAGS='-arch x86_64' python setup.py build
ARCHFLAGS='-arch x86_64' python setup.py install
defaults write com.apple.versioner.python Prefer-32-Bit -bool yes

To check to see if it worked, execute this:

python
import MySQLdb
exit();

So if you didn’t get icky output, then that worked. If it didn’t, I’m not the person to ask.
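The same check can be scripted so that it covers Django as well. This is only a sketch: can_import is a made-up helper name, and it reports whether each module imports without saying why a failing one failed.

```python
def can_import(module_name):
    """Return True if module_name imports cleanly, False otherwise."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # The two Python packages installed so far in this post.
    for name in ("django", "MySQLdb"):
        status = "ok" if can_import(name) else "MISSING"
        print("%-8s %s" % (name, status))
```

If MySQLdb shows up as MISSING, running "import MySQLdb" by hand as above will show the actual error message.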

If that worked for you and you’re not using PIL, then you’re done. Head on over to the Django tutorial if you want to try it. If you’re doing drawing stuff, then you’re not out of the woods yet.

Reference:

  • http://yousefourabi.com/blog/2008/06/django-on-leopard/
  • http://cd34.com/blog/programming/python/mysql-python-and-snow-leopard/
  • http://www.jaharmi.com/2009/08/29/python_32_bit_execution_on_snow_leopard

4) Installing libjpeg and Python Imaging Library

First, you’re going to need to have Xcode installed so you can compile libjpeg. If that’s not already installed, you can find it on your Snow Leopard installation DVD under “Optional Installs” or from the Apple website.

We’ll start with libjpeg. Download it from here. You’ll probably want one that is of the form jpegsrc.v7.tar.gz, where the version might be different. Again, extract, navigate in, and execute this

export CC=/usr/bin/gcc-4.0
cp /usr/share/libtool/config/config.sub .
cp /usr/share/libtool/config/config.guess .
./configure --enable-shared --enable-static
make
sudo mkdir -p /usr/local/include
sudo mkdir -p /usr/local/lib
sudo mkdir -p /usr/local/man/man1
sudo make install
sudo ranlib /usr/local/lib/libjpeg.a

Don’t ask me what a lot of that means. I just copy what people tell me to do.

Finally, we can install the PIL. Download it from here, extract, and navigate into it. The last Terminal commands are

python setup.py clean
python setup.py build
sudo python setup.py install

And that should be it.
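To see whether PIL actually picked up libjpeg, one option is to ask it to write a tiny JPEG. The sketch below is an assumption-laden check, not an official test: jpeg_support is a made-up name, and the fallback "import Image" is my guess at how older installs expose the module.

```python
import tempfile


def jpeg_support():
    """Return True/False for JPEG support, or None if PIL is not importable."""
    try:
        from PIL import Image
    except ImportError:
        try:
            import Image  # assumed fallback: top-level module on older installs
        except ImportError:
            return None
    try:
        with tempfile.TemporaryFile() as handle:
            # Saving a 1x1 image forces PIL to load its JPEG encoder.
            Image.new("RGB", (1, 1)).save(handle, "JPEG")
    except IOError:
        return False
    return True


if __name__ == "__main__":
    print("JPEG support: %s" % jpeg_support())
```

A result of False suggests that PIL was built before libjpeg was in place, in which case rebuilding PIL after the libjpeg steps is the thing to try.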

Reference:

  • http://stackoverflow.com/questions/1438270/installing-python-imaging-library-pil-on-snow-leopard-with-updated-python-2-6-2
  • http://jetfar.com/libjpeg-and-python-imaging-pil-on-snow-leopard/
  • http://snippets.dzone.com/posts/show/38
  • http://stackoverflow.com/questions/1398701/problems-with-snow-leopard-django-pil

Wrapping Up

So there you go. To get your Apache server running to make your Django app go, go to System Preferences->Sharing and check “Web Sharing”. Your Django app can live where it is, but if you need to drop media files somewhere, the www root for Mac OS is /Library/WebServer/Documents/

Have fun!

Don’t rsync /var with --delete

Sunday, July 19th, 2009

About 10 minutes ago, I learned an important lesson: make sure you fully qualify the folder you want to sync. And don’t use the delete option if you’re not entirely sure how to use it. The bad news is that I accidentally deleted this website and panicked for a second.

But I like to look at it on the bright side. For about 10 minutes of downtime, the good news is:

  • There’s a reason why people don’t just use root all the time. I need to be better about this, because it paid off majorly this time to not have permissions.
  • My laziness in working on my website template means that I didn’t accidentally erase a bunch of work.
  • I now know how rsync works a bit better.
  • It caused me to back up my blog database so that the next time there’s a disaster, I’ll only be partially screwed.
  • I didn’t need to upgrade WordPress because a fresh install does that, too.

Wow. That could’ve been a lot worse.
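The post doesn’t record the exact command that went wrong, so what follows is only a sketch of the two habits from the lesson: fully qualify both folders, and preview a --delete run before doing it for real. The function name rsync_command and the example paths are invented for illustration; -a, -v, --delete, and --dry-run are standard rsync options.

```python
import posixpath


def rsync_command(source, dest, delete=False, dry_run=True):
    """Build an rsync argument list, refusing paths that are not fully qualified."""
    for path in (source, dest):
        # Drop an optional "user@host:" prefix before checking the path itself.
        local_part = path.split(":", 1)[-1]
        if not posixpath.isabs(local_part):
            raise ValueError("not a fully qualified path: %r" % path)
    args = ["rsync", "-av"]
    if delete:
        args.append("--delete")
    if dry_run:
        # With --dry-run, rsync only reports what it would copy or delete.
        args.append("--dry-run")
    return args + [source, dest]


if __name__ == "__main__":
    # Both paths are made-up examples.
    command = rsync_command("/home/kevin/site/", "/var/www/site/", delete=True)
    print(" ".join(command))
```

The dry run is on by default, so getting a command that really deletes files takes an explicit dry_run=False.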