Programmer Tips
Parameter Order
Password Security
Sites are compromised all the time, and the public generally isn't aware of most of the minor breaches. People only start to care when passwords leak from a larger site and it makes the local news. At the time of this article's writing, LinkedIn was the most recent big hack, with 6.5 million password hashes leaked to the public. At least they were doing things a little better than many places - they hashed their passwords, but unfortunately didn't use salts. More on that later.

To understand good password security, I will first give you a breakdown of what hackers will do to find your password. With this knowledge, you can understand why you need to use better passwords and a different password for each site. I will also give you real-world examples of password cracking times, password cracking keyspace, and more. For reference, this is on an HP Pavilion g6 with an Intel Core i3 2350M processor (2 cores with hyperthreading) running at 2.3 gigahertz, and an OCZ Agility3 SSD, which makes loading passwords and hashes faster.

What is password hashing?
Thankfully, most places know that storing your plain password in a database is a really bad idea. For example, a plain password will look like "Testing". Instead, developers will use a variety of techniques to turn this into a really large set of letters and numbers. This is typically called hashing. "Testing" as a hash looks like "fa6a5a3224d7da66d9e0bdec25f62cf0" or "0820b32b206b7352858e8903a838ed14319acdfd". Hashes are computed using a method that doesn't allow you to go backwards. You might have one of the hashes above, but you can't go backwards and get "Testing" from it. You can only go forwards. When you log into a system, it generates a new hash from the password you just typed. If the two hashes match, you can log in.
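As a minimal sketch of that login check, here is how hashing and comparing might look in Node.js. The article does not say which algorithms produced its example hashes; MD5 and SHA-1 are assumptions here, chosen only because their 32- and 40-character hex outputs match the lengths shown. The function names are made up for illustration.

```javascript
const crypto = require("crypto");

// Hash a password with the given algorithm and return it as hex text.
// "md5" and "sha1" are assumed algorithms, not confirmed by the article.
function hashPassword(password, algorithm = "sha1") {
  return crypto.createHash(algorithm).update(password, "utf8").digest("hex");
}

// Log in by hashing what was typed and comparing it with the stored hash.
// The stored hash is never turned back into the original password.
function verifyLogin(typedPassword, storedHash, algorithm = "sha1") {
  return hashPassword(typedPassword, algorithm) === storedHash;
}

const storedHash = hashPassword("Testing");

console.log(hashPassword("Testing", "md5").length); // 32 hex characters
console.log(storedHash.length);                     // 40 hex characters
console.log(verifyLogin("Testing", storedHash));    // true
console.log(verifyLogin("testing", storedHash));    // false
```

Note that a fast, unsalted hash like this is exactly what the rest of the article attacks; it is shown only to make the "forwards only" idea concrete.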
While it is impossible to go backwards, hackers can certainly take an entire dictionary of words and hash them all really quickly, checking if any of those hashes equal your password. For the LinkedIn breach, I can scan all 6.5 million hashes against various wordlists, each time starting from scratch. It takes 12 seconds to load an empty wordlist, since it takes that long just to read the hashes and prepare for the cracking attempt. Against my list of the top 7184 passwords, I found 3854 hashes in 14 seconds. Against my list of 97 thousand English words, I recovered 22,572 in a mere 15 seconds. When I threw in my huge wordlist of over 18 million words from various languages and other password breaches, and mixed in the mutation engine to generate more likely passwords from the wordlist, I was able to crack over 390,000 of the hashes in a mere 129 seconds.

Hashing methods vary, and newer ones are typically more secure (i.e., fewer "collisions") and harder and slower to compute than older ones. It's good to know that people who take security seriously aren't making the hackers' lives too easy.

Brute Force Attacks
I was using a wordlist to generate possible passwords to try, but another approach is to just use "brute force" and guess every possible password combination. There are even rules that can be applied to the guesses to make it more likely that a real password gets generated. They come from us being humans and how we build our words. For example, at the beginning of a word the letter "C" is often followed by "H" and only rarely followed by "K". I flipped the password cracker into brute force mode and let it run for almost exactly 48 hours. On its own and without any help from any wordlists, it recovered over 2 million passwords.

Distributed Cracking
The numbers I have been listing are for me cracking hashes on my own with just my laptop. Imagine if I had a bunch of computers at a college available to try passwords simultaneously.
What if I spun up 20 Amazon cluster compute instances to crunch numbers for a mere day? What if a bunch of hackers got serious about cracking a set of passwords and decided to pool all of their resources together?

Rainbow Tables and Salts
There are additional password cracking techniques out there to speed up the cracking. One of them is called "rainbow tables", where much of the hashing work is done ahead of time and stored, saving the cost of computing the same hashes over and over. It really speeds up the efforts when used against a susceptible hash. By "really speeds up", I am talking about cutting days of cracking down to just minutes.

System administrators have tried to combat this issue by using "salts" with passwords. A simple way to think of the salt is to take the hash of your password with "blah blah blah blah blah" at the end of it. Because the salt changes the resulting hash, tables generated ahead of time for unsalted passwords no longer match, and the attacker is back to computing hashes from scratch. A better scheme would add "blah blah blah blah blah" to the hash and then hash that again, but we don't need to go into that here. I merely wanted to point out that there are techniques that can be applied to slow down the process.

Precautions You Need To Take

Never reuse passwords!
If Mr. Evil Hacker gets your username and password from one website you visit, would you want Mr. Evil Hacker to also be able to get into your email? Bank account? Many people made the mistake of using the same password for LinkedIn as they did for other sites. Now they must assume that all of those sites are compromised and that their personal information may have been leaked to unsavory characters.

Don't write passwords down
You're probably thinking "How am I to remember my crazy passwords for each site?" Writing them down leaves them in plain text. The note could be hiding on your desk or on a scrap of paper in your pocket, but it's insecure. Someone could easily walk over and read your passwords. Use a password manager if that helps.
Depending on your needs, software that runs on your phone may be ideal. Others use the secure password storage built into web browsers like Chrome and Firefox; it is best to guard this with a "master password". There are online password storage solutions like LastPass and Clipperz that can integrate into your browser. Just make sure you can back up the stored passwords and that you can get to them easily whenever you need them.

Make up stories and use the third letter of each word for your password. Or the first. Or use poems. Keep a book with you and assign each site a page number, then use the 10th letter from each line down the page. Use Diceware or another system to generate truly random passwords. Diceware uses a large table of English words, so use this method only when you can make passwords of tolerable length. I would use at least five different words before feeling good about a random website.

Use randomly generated passwords
This eliminates the bias that people have when making up passwords. If an attacker knows you speak English, they will probably generate passwords that look like English words. "gpdswoir" is far stronger than "homework" even though they are the same length. That's because Mr. Evil Hacker will first attempt to use a wordlist, like I did, to get as many passwords as possible. Even the tips of "add a number at the end" and "change i into ! and a into @" are represented in mutation rules that can be applied to wordlists. Your password really isn't much stronger since it is, in the end, just a dictionary word.

Use different types of characters
Randomly generated passwords usually mix in uppercase, lowercase, numbers, and symbols. Let's say you only used lowercase letters. You would have only 26 options available at each position of your password. This is called your keyspace. To calculate the number of possibilities for a given password length, you multiply the keyspace by itself once for each character in the password (keyspace ^ length).
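As a rough sketch of that multiplication, the helper below raises the keyspace to the power of the password length. The function name is made up for illustration, and BigInt is used only so the large counts stay exact.

```javascript
// Number of possible passwords = keyspace ^ length.
// BigInt keeps the result exact even for long passwords.
function possibilities(keyspace, length) {
  return BigInt(keyspace) ** BigInt(length);
}

console.log(possibilities(26, 1)); // 26n
console.log(possibilities(26, 2)); // 676n
console.log(possibilities(26, 8)); // 208827064576n
console.log(possibilities(94, 8)); // 6095689385410816n
```

Adding one more character multiplies the count by the keyspace again, which is why both a larger keyspace and a longer password help.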
If you are looking for any single-letter lowercase "password", you have a mere 26 options. If you want all two-letter passwords, that would be 26 * 26 (26^2) = 676 options. If a site forces all passwords to lowercase or uppercase and you type in an 8-letter word, that's only 208 billion possibilities, or 2.1x10^11.

208 billion! That's a lot, you may think. Even with hundreds of billions of possibilities, it looks like it would only take 414 hours to go through them all on my computer. Yep, if you used 8 or fewer characters in your password, it doesn't really pose a challenge and I would certainly get it if I tried. With the advanced algorithms out there and wordlists, I'll probably still be able to crack most password hashes in the first 24 hours.

So, those 208 billion possibilities (or 2.1x10^11 possibilities) are not nearly enough to thwart a concentrated attack. We're going to start dealing with really large numbers here, and your goal is to make the exponent (the "11") much larger. Each time you can get the exponent even one larger, it takes 10 times the computing power to search all possibilities.

By using different types of characters, such as uppercase, numbers and symbols, you increase the keyspace dramatically. Instead of a keyspace of merely 26, now you increase it to 26 (lowercase) + 26 (uppercase) + 10 (numbers) + 32 (symbols) = 94 characters. A randomly generated 8-character password using a keyspace this large can make about 6.1 quadrillion different passwords, or 6.1x10^15. Again, we should focus on that "15". We just made your password more than 10,000 times harder to guess. According to my computer's statistics, my laptop could crack any 8-character password in about 1,100 days.

Assume hackers coordinate their attack and pool their resources. Let's say we get a team of merely 100 hackers, each with 10 big machines (potentially a REALLY low estimate).
With this dedicated group of hackers and access to more powerful machines, all 8-character passwords could be cracked in just over a day (about 26-27 hours). With botnets and hundreds of thousands of drone computers at your disposal, you could crack this in hours or minutes.

Size really matters
Each character increases the difficulty of the hack exponentially. Depending on your keyspace, this could mean significant changes. Assuming a keyspace of 95 characters and a length of 8, there are 6.6x10^15 possibilities. By including just one more random character, we can generate 6.3x10^17. One extra keypress means it is almost 100 times harder to guess. The cracking time for my laptop went from about 1,100 days to about 105,000 days. The dedicated group of hackers now would spend 1/3 of a year instead of a day. A botnet equivalent to 100,000 of my laptops would still get this password in just one day.

If a site lets you use 12 characters, that's far better. If the site doesn't restrict length, you could use 20 or more characters. With a 95-character keyspace, 12 characters can produce 5.4x10^23 possibilities and 20 characters can make 3.6x10^39 different combinations. We're going for computationally infeasible, and this certainly qualifies.

Use spaces too
There are a lot of password crackers out there that don't crack multi-word passwords by default. At least add the 96th character to the keyspace. With an 8-character password, we increase from 6.6x10^15 to 7.2x10^15, which is only a minor jump, but we've now eliminated the normal use of wordlists and people will have to crack your password using non-default techniques.

Change your password
The longer that someone has to crack your password, the more likely they will get it. Why leave that window of opportunity open for so long? I'm not advocating changing your passwords daily (which can also be a security risk), but perhaps change them yearly, or change the ones you care about with every season.
If there is a breach, change your password right away
It's likely hackers had your password in their hands for quite a while before a company admits it was hacked. Before I got an email from LinkedIn, I had the 6.5 million password hashes in my hands and had already found that a password matching mine was leaked.

Use two-factor authentication when possible
Not many places let you do this. It is difficult for people to guess a password; it is next to impossible for them to guess both your password and the number from a two-factor authentication method. There is software for smartphones, and there are key fobs, that can be tied to web sites to generate a new number every minute automatically. Instead of just relying on something you know (your password), these sites also rely on something you have (the number generator).

Assume there is no security
Often there isn't any. Lots of sites store your password without any encryption or in a way where they can get the original password back. Other sites mess up and encrypt the password poorly or rely on obfuscation instead of real security. If someone gets into one of your accounts, they may try that username and email address with that password elsewhere. They might be able to see the password recovery questions and answers, then try to use those on other sites. If they hacked your email account, they might request password resets on other sites and intercept the emails to gain access to those sites as well.

Be careful. If your information gets exposed, you may be at a bigger risk than you realize. Plan carefully and try to make each account as individual and separate as possible. When you assume your password will be compromised and you plan for it, then news of password leaks at LinkedIn (or any other place) won't have you worried at all.
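To make the wordlist attack and the effect of a salt described in this article a little more concrete, here is a small sketch in Node.js. Everything in it is made up for illustration: the "leaked" hashes, the four-word wordlist, the function names, and the choice of unsalted SHA-1. Real cracking tools are far faster and add mutation rules that this sketch leaves out.

```javascript
const crypto = require("crypto");

// Unsalted SHA-1 as hex text (an assumed algorithm for this sketch).
function sha1(text) {
  return crypto.createHash("sha1").update(text, "utf8").digest("hex");
}

// Wordlist attack: hash every candidate word and check whether
// that hash appears among the leaked hashes.
function crackWithWordlist(leakedHashes, wordlist) {
  const leaked = new Set(leakedHashes);
  const recovered = new Map(); // hash -> recovered password
  for (const word of wordlist) {
    const hash = sha1(word);
    if (leaked.has(hash)) recovered.set(hash, word);
  }
  return recovered;
}

// Made-up data: three "leaked" hashes and a tiny wordlist.
const leakedHashes = ["homework", "letmein", "gpdswoir"].map(sha1);
const wordlist = ["password", "homework", "letmein", "dragon"];

const recovered = crackWithWordlist(leakedHashes, wordlist);
console.log([...recovered.values()]); // [ 'homework', 'letmein' ]

// With a salt appended before hashing, the same wordlist hashes no
// longer match, so work done ahead of time for unsalted passwords is wasted.
const salt = "blah blah blah blah blah";
const saltedLeak = ["homework", "letmein"].map((p) => sha1(p + salt));
console.log(crackWithWordlist(saltedLeak, wordlist).size); // 0
```

The random password "gpdswoir" survives because it is not in the wordlist. An attacker who learns the salt can still append it to every guess, so a salt defeats precomputed work rather than guessing itself.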
VMWare Tweaks
Best Programming Language?
Shrinking VM Disk Images
I have been asked to compress dynamically-sized virtual disk images more than once. These instructions can apply to VMDK files (common for VMware) and VDI files (VirtualBox). This sort of request seems to come up every year or two for me. Usually it is because some place is gearing up to distribute these disk images and serving up gigs of data is undesirable. I come up with the same sort of steps time and time again. Instead of recreating this work the next time I get asked, I'm posting these instructions online to record them publicly. I've found that they are more thorough than what I find on other sites, so perhaps you could benefit from these instructions too.

First Step - Backup
Make a backup. The steps below can really destroy images; follow them AT YOUR OWN RISK.

Reconfigure The Machine
Before you distribute the disk image, you may need to tweak the configuration so that other virtualization tools will work with your image correctly.

Disable Network Configs Via MAC Addresses
If you use kudzu, you should disable it or it may prompt you when you start up the VM and it has a new MAC address. Kudzu ships with older Red Hat and CentOS.
chkconfig kudzu off
rm /etc/udev/rules.d/70-persistent-net.rules
cd /etc/sysconfig/network-scripts
perl -pi -e "s/^HWADDR/#HWADDR/" ifcfg-eth*

Free More Space
It is a common misconception that deleting files on your dynamically sized disk image will make the disk image shrink. This is not true - the virtual machine software doesn't peek into the filesystem to determine that sectors aren't needed any longer. More on this later ... For now, let's focus on making some room.
Delete temp files. They shouldn't be needed.
# Linux variants
Clean your package manager's cache on Linux:
# Red Hat, CentOS
yum clean all

Defragment The Drive
This step really isn't needed, but it could help to squeeze out a few more bytes if you are really concerned. There's really no defrag for Linux. For Windows, I suggest using UltraDefrag.

Wiping Free Space
Even after you delete the files, the hard drive image still has the contents of the old files on it. This is why programs like photorec can work. We need to wipe the data clean off the drive by writing NULL (hex 0x00) bytes to all of the free areas on the drive. This still doesn't make the image any smaller. More on this later ...

Wiping Linux From CD
The easiest way to wipe extfs filesystems (ext2, ext3, ext4) is with zerofree. It's the faster choice. You can download the ISO image of Parted Magic and configure your VM to mount it as a virtual CD-ROM. Boot from it, then open a terminal by clicking on the black monitor icon at the bottom. From there, it is a few simple commands:
Public DNS Pointing To localhost (127.0.0.1)
When you are developing and using a local development environment, you typically need to hit your own site. A lot. You'd use URLs that look like this:
http://localhost/
http://127.0.0.1/

When you get slightly more advanced, you would want to run multiple sites off your installation. You can easily do this with name-based virtual hosts (e.g., with VirtualHost directives in Apache's config). Now you want to use URLs like this:
http://client1.local/
http://client2.dev/
http://client3/

Those URLs don't work, so now we need to find some way to map our domain names to the "localhost" address. What if we could map hostnames to 127.0.0.1 and make this work?

Hosts File
The first and easiest method is to edit your hosts file (/etc/hosts in Linux, C:\Windows\System32\Drivers\etc\hosts for some versions of Windows) and add lines like this:
127.0.0.1 client1.local
127.0.0.1 client2.dev
127.0.0.1 client3

At work, we have up to five different hostnames for each of our clients. Adding yet another client means dozens of developers now need to edit their hosts files. Oh, the pain and agony when you have to do this for hundreds of domains! What if we could have a single top-level domain that always resolved to localhost?

DNS Entries - Windows
If you are using Windows DNS, you can create a new zone:
dnscmd /RecordAdd local * 3600 A 127.0.0.1
dnscmd /RecordAdd local @ 3600 A 127.0.0.1

dnsmasq - Linux, MacOS
On Linux systems, you can install dnsmasq to pretend to be a real DNS server and actually respond with 127.0.0.1 for all subdomains of a top-level domain. So, if you wanted *.local to always resolve to your own machine, then you can use URLs like this:
http://client1.local/
http://client2.local/
http://client3.local/

You only need to install and set up dnsmasq. There are some well-written instructions at http://drhevans.com/blog/posts/106-wildcard-subdomains-of-localhost that you can follow; I won't repeat them here.

The drawback of this setup is that you now have to install and configure dnsmasq on every machine where you want to use this trick. What if someone set up DNS entries and basically did this for you?

Available Wildcarded DNS Domains
It turns out that some kind-hearted people have already set up wildcarded domains for you. You can use any domain below and any subdomain of it, and they will always resolve back to 127.0.0.1 (your local machine). Here's the list of ones I know about. Let me know if there are more!

Now, with these wildcarded domains, you don't need to do any modification of your system for requests to come back to your own server. For instance, you can go to http://client1.127-0-0-1.co.uk/ and the web page request will always head back to your own server. You'll still need to configure your web server to answer on this hostname, but at least the DNS portion of the problem is now solved.
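The "configure your web server to answer on this hostname" part is ordinary name-based virtual hosting: the server looks at the Host header and picks a site. The article uses Apache for this; the sketch below only illustrates the same idea with Node's built-in http module. The site table, function names, and port are all hypothetical, and the sketch does not check whether the example domain still resolves.

```javascript
const http = require("http");

// Hypothetical site table: hostname -> response body.
const sites = {
  "client1.127-0-0-1.co.uk": "Client 1 dev site",
  "client2.127-0-0-1.co.uk": "Client 2 dev site",
};

// Pick a site from the Host header, ignoring an optional ":port"
// suffix and letter case. Returns null for unknown hostnames.
function siteForHost(hostHeader) {
  const hostname = String(hostHeader || "").split(":")[0].toLowerCase();
  return Object.prototype.hasOwnProperty.call(sites, hostname)
    ? sites[hostname]
    : null;
}

// Build (but do not start) a server that dispatches on the hostname.
function createDevServer() {
  return http.createServer((req, res) => {
    const body = siteForHost(req.headers.host);
    res.writeHead(body ? 200 : 404, { "Content-Type": "text/plain" });
    res.end(body || "Unknown site");
  });
}

// To try it: createDevServer().listen(8080);
// then browse to http://client1.127-0-0-1.co.uk:8080/
```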
Escaping Strings
There is a lot of confusion out there about the proper way to escape strings in different languages for different purposes. I recently had a discussion with an acquaintance regarding the correct way to escape a regular expression in PHP. To that end, I wrote up an email to him in an attempt to explain why I said he didn't have enough backslashes.
Let's pretend we want to write a regular expression to remove all periods. I'm using only a couple languages to better illustrate my point, and please don't mention that the JavaScript one doesn't really need a RegExp object. Remember, this code is designed to show you a tricky part about escaping.
// Version 1
// PHP
$result = preg_replace("/./", "", $input);
// JavaScript ("g" flag so every match is replaced, like preg_replace)
var regexp = new RegExp(".", "g");
var result = input.replace(regexp, "");

// Version 2
// PHP
$result = preg_replace("/\./", "", $input);
// JavaScript
var regexp = new RegExp("\.", "g");
var result = input.replace(regexp, "");
There, done.
Or am I? When doing escaping in strings, the backslash character is often the indicator that the next character is treated differently. For instance, \n translates to a newline character, \t becomes a tab character and \\ means to put in a literal backslash character. The real string that we want passed into the regular expression engine is literally \. (a backslash and a period = an escaped period). We need to take that string and then escape it again to embed it in our code properly.

// Version 3
// PHP
$result = preg_replace("/\\./", "", $input);
// JavaScript
var regexp = new RegExp("\\.", "g");
var result = input.replace(regexp, "");
Finally our code is correct. I've received several questions about the multiple levels of escaping, so let's anticipate some questions and provide useful answers right away!
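Why are we escaping the period twice?
It's because the string goes through two levels of unescaping before being used - first it goes through the language's own string unescaping (PHP's or JavaScript's) and then through the regular expression engine's unescaping.

Why does Version 2 still work?
Great question, and I think this is the source of the confusion about string escaping. In PHP, Version 2 still works because \. doesn't "unescape" to anything. Instead of choking and dying, PHP will just let the two characters go through, so the regular expression engine still receives \. as intended. JavaScript is less forgiving: it silently drops the backslash from an escape it doesn't recognize, so "\." becomes just "." and the JavaScript half of Version 2 behaves exactly like Version 1. Better yet is this list of unescaped strings: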
If Version 2 works, why worry about proper escaping?
It is doubtful that the string processing engine of the different languages will change much in the future. However, proper escaping could help you avoid problems. Let's pretend you wanted to match a literal backslash and any character. You'd want the pattern \\. and in both of the example languages it should be escaped as "\\\\." (yeah, four backslashes = 1 literal backslash because it gets unescaped twice). If you get your escaping messed up or don't know how many levels of unescaping will happen, you would get unexpected results. If you only used the string "\\." in either language, it would match only periods, not a backslash followed by any character.

In PHP there are these single-quoted strings where you don't need to escape ...
Sorry, that's wrong. You must still escape there. Try echo '\\' or echo '\'' (that's two single-quotes, not a double quote at the end). Without the escaping in single-quoted strings, you would not be able to embed an apostrophe. Most of the escape sequences are disabled, however, so sequences like \n and \t will not produce a newline or a tab.

In conclusion ...
So, armed with this knowledge, I could ask you to escape the regular expression that looks for a backslash, a period, a double quote, and a slash. You'd be able to produce the following:
// Looking for: \."/
// Escaped for Regular Expression: \\\."/
// PHP - Escape backslashes, slash, double quote
$result = preg_replace("/\\\\\\.\"\\//", "", $input);
// JavaScript - Escape backslashes and double quote
var regexp = new RegExp("\\\\\\.\"/");
var result = input.replace(regexp, "");
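One way to check how many levels of unescaping have happened is to look at the string before it reaches the regular expression engine. The sketch below does that for the JavaScript side only; the variable names are made up for illustration. It relies on the fact that JavaScript string literals drop the backslash from an escape they don't recognize, such as "\.".

```javascript
// What the JavaScript string parser hands to the RegExp engine.
const droppedBackslash = "\.";  // unrecognized escape: backslash is dropped
const keptBackslash = "\\.";    // escaped backslash: one backslash survives

console.log(droppedBackslash.length); // 1
console.log(keptBackslash.length);    // 2

const input = "a.b.c";

// Pattern "." matches any character, so everything is removed.
console.log(input.replace(new RegExp(droppedBackslash, "g"), "")); // ""

// Pattern "\." matches only literal periods.
console.log(input.replace(new RegExp(keptBackslash, "g"), ""));    // "abc"

// Four backslashes in source -> two in the pattern -> one literal
// backslash matched, followed by any single character.
const backslashThenAny = new RegExp("\\\\.", "g");
console.log("a\\b.c".replace(backslashThenAny, "")); // "a.c"
```

Printing the string's length or contents like this is a quick sanity check whenever you are unsure how many backslashes made it through.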