Ubuntu 9.04 & CPyrit-Stream now working!
Yay!
I've finally gotten Pyrit running and utilizing ATI Stream! I followed these instructions to the letter, though I built RPM from source with the patch for LZMA-compressed RPMs, which did the trick (I've also read that the 1.4.0 beta 2 package of the ATI Stream SDK doesn't have this problem, but anyhow). I think I also had to apt-get some libraries that were missing, but they were listed pretty well in the instructions.
As for building Pyrit, I used the instructions in the wiki, which can be found here. I ran into an error while building the Pyrit source, but that was fixed by editing a file according to these instructions. Fixes for other common errors are in the installation wiki.
So, for the order: install Atistream and Atical according to the instructions in the KB. Apt-get any packages you are missing. Build and install Pyrit, then CPyrit-Stream.
Run the command pyrit list_cores, which should show something like the screenshot below, and then run pyrit benchmark to see what kind of numbers you're getting on your hardware. I am amazed. Compare the over 8000 PMKs/s (pairwise master keys) with the ~700 from one core of a Phenom II X4 940. Look at those results (yes yes, synthetic benchmark...):
You'll note that it only shows three of the four cores on my Phenom; this is a feature. For every GPU core that it handles, it saves one CPU core for scheduling tasks.
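For a rough sense of scale, the arithmetic behind those figures can be sketched in a few lines of shell. The function names are made up for illustration, and the sketch assumes the ~700 PMKs/s figure is per CPU core and that one CPU core is held back per GPU core, as described above.

```shell
#!/bin/sh
# Back-of-the-envelope arithmetic on the benchmark figures quoted above.
# cpu_workers and core_equivalents are hypothetical helper names.

# CPU cores left for hashing when one core is reserved per GPU core.
cpu_workers() {
    cpu_cores=$1
    gpu_cores=$2
    echo $((cpu_cores - gpu_cores))
}

# How many CPU cores' worth of throughput a given PMKs/s figure represents.
core_equivalents() {
    awk -v total="$1" -v per_core="$2" 'BEGIN { printf "%.1f\n", total / per_core }'
}

cpu_workers 4 1             # Phenom II X4 plus one GPU core
core_equivalents 8000 700   # 8000 PMKs/s versus ~700 PMKs/s per CPU core
```

So the one card is doing the work of roughly eleven of those CPU cores, give or take, since both inputs are rounded.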
A man can always dream... That there is about 3000 euros worth of hardware (four Nvidia GTX 295s, a motherboard that supports four PCIe cards, processor, memory... I guesstimated). 80 000 PMKs/s (or half of that, depending on how you read the benchmarks). It seems to see the cards as two cores each.
Edit for 15.8.09 - I'm working on a proper howto for this thing, since the internets seem not to have a coherent guide for a current Ubuntu version. The 8.04 guide is great, don't get me wrong, but I think it could be more complete. I've also e-mailed AMD to ask about providing .deb packages on my / their site, and/or publishing the new howto.
Ubuntu 9.04 x64 & Pyrit with ATI Stream
Okay, so since I just got the new graphics card (an ASUS EAH4850), I wanted to try out some of the GPU computing possibilities of the card. The Pyrit project exists to take advantage of multiple GPU computing platforms, such as Nvidia CUDA and ATI Stream, so I decided to give that a whirl.
I downloaded the Pyrit and the ATI Stream packages from the Pyrit site. I found out I also needed the ATI Stream SDK, which can be obtained from the AMD site. The thing to be noted here is that there is currently only support for RPM-based systems, such as Fedora, CentOS etc. So of course, I thought, "Alien!", the package converter. I apt-get'ed Alien and RPM, and got working on the thing.
You download the package, which is a .tar.gzip file. Unpack the file to get to the .run file. The run file can be executed simply with ./filename.run, which runs the installer script embedded in it. It'll fail shortly after the EULA, or it did on my x64 system.
I opened up the run file, and commented out the part where it deletes the temporary folder where it extracts the actual RPM file (and before that, tries to run rpm on the file, which fails).
#!/bin/bash
echo "ATI Brook+ SDK Installer"

TMP="atibrook"
HERE=`pwd`
DST=/usr/local
FOPEN="more"
RPM="alien"

#Extract archive into /tmp/atibrook
echo -n "Extracting archive..."
dd if=$0 of=/tmp/${TMP}.tar.gz bs=1 skip=16384 >& /dev/null
echo "DONE"
mkdir /tmp/atibrook
cd /tmp/atibrook
echo -n "Uncompressing package..."
tar -xzf ../${TMP}.tar.gz
echo "DONE"

#Accept EULA
${FOPEN} EndUserLicense.txt
echo -n "Do you accept this license agreement? [y/n]: "
read agree
if test A"$agree" = Ay -o A"$agree" = AY; then
echo "You accepted the license, continuing installation."
else
echo "You declined the license, aborting..."
rm -rf /tmp/atibrook
rm /tmp/${TMP}.tar.gz
exit
fi

#Install via rpm
echo ""
echo -n "Select a path for installation [default]: "
read USERPATH

if test "$USERPATH" != ""; then
echo "Using '$USERPATH' for directory prefix."
echo ""
echo "Installing package via RPM..."
$RPM --prefix=$USERPATH /tmp/atibrook/*.rpm
else
echo "Using default directory /usr/local/atibrook"
echo ""
echo "Installing package via RPM..."
$RPM /tmp/atibrook/*.rpm
fi

#### THIS PART I COMMENTED OUT SO IT LEAVES THE RPM INTACT ####
#Cleanup
#echo ""
#echo "Removing Temporary Files..."
#rm -rf /tmp/atibrook
#rm /tmp/${TMP}.tar.gz
echo "Exiting installation..."
exit
So the result is that in /tmp/atibrook you now have the rpm file.
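If you'd rather not edit the installer at all, the same extraction can be done by hand. This is only a sketch: it assumes the 16384-byte offset from the dd line in the script holds for your copy of the .run file, and extract_payload is a made-up helper name.

```shell
#!/bin/sh
# Split the embedded tarball off a self-extracting .run file.
# extract_payload is a hypothetical helper; the default offset of 16384
# bytes is taken from the dd line in the installer script.
extract_payload() {
    run_file=$1
    out_file=$2
    offset=${3:-16384}
    # tail -c +N starts output at byte N (1-based), so skipping
    # "offset" bytes means starting at byte offset + 1.
    tail -c "+$((offset + 1))" "$run_file" > "$out_file"
}

# Example (commented out, needs the real installer):
# extract_payload ./filename.run /tmp/atibrook.tar.gz
# mkdir -p /tmp/atibrook && tar -xzf /tmp/atibrook.tar.gz -C /tmp/atibrook
```

This does the same job as the script's dd with bs=1, just without copying one byte at a time.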
Running Alien against it results in an error about Rpm.pm on line 155. Something relating to Perl; the complete error is:
Installing package via RPM...
Unpacking of '/tmp/atibrook/atistream-brook-1.4.0_beta-1.x86_64.rpm' failed at /usr/share/perl5/Alien/Package/Rpm.pm line 155.
Exiting installation...
Now, I have no fucking idea how to fix it. Looking at line 155, it relates to the cpio command not working properly, but how and why and what the fuck? I'm not a developer. I'll need to show this to someone, like B; maybe he can figure it out.
I also tried instructions I found on the AMD Developer Forum (requires registration). These detail the use of rpm2cpio instead of alien, but that doesn't work either. The RPM seems malformed somehow, perhaps as a result of it being made with a specific tool (the name of which escapes me) that creates files rpm2cpio can't read.
Blargh. I'm gonna run a Fedora 11 live CD soon, and see that it actually works. Get some numbers off this thing. It's supposed to do 7800 PMKs/s, which is a lot faster than, for instance, an Intel Core i7 920. Sweetness.
Agamemnon Updates
Alrighty then. I had the wonderful opportunity to get a lightweight UPS from my good friend G, who has no need for one. It's a Powerware 5110, with a USB interface, surge protection and RJ-11 filtering.
That puppy is now hooked up to Agamemnon, the main server on my network. The server has dual power supplies, so I'm not sure what the proper way to hook it up would be, but the way I did it for now is: one power supply is hooked up to AC power, the other goes through the UPS. To test it, I pulled out the AC-powered one, and then the UPS's AC power cord, thus leaving the server on one power supply and the UPS battery. The UPS started beeping to tell me it was running on batteries, but the machine worked like a charm.
I also added a 146 GB RAID-1 array, called /dump, for general file storage for my users and myself. One of the disks is apparently faulty, so I'll need to swap that fucker out today. Lucky it's RAID-1.
Intel 915 chipset and Windows 7
Lo! The latest incarnation of Windows Vista, also called Windows 7, has/had problems with certain integrated graphics chipsets, particularly lower-end Intel chipsets such as the 915 that I have in my Thinkpad X41. This was a shame, because it would only run 1280x1024, and had no chance of running anything fancy, since it was being detected as "Standard VGA Adapter".
There were no drivers either from Intel or Microsoft for the longest time. But Microsoft released a driver that claims to work with the Intel chipset I had. It just popped up on Microsoft Update, so I thought I'd give it a whirl. Usually the drivers Microsoft releases are perhaps not the most optimized, but they mostly work.
This one did not.
The driver installed, and the hardware showed correctly in Device Manager. It wanted to reboot, so I did. And after that it was back to the status quo: no driver installed. Then it started "looping", trying to install the driver and failing every time. I was miffed, and went back to the standard VGA driver.
But then I came to work, and I had to hook up my Dell E228WFP 22" monitor, and since I couldn't get the native resolution on the standard VGA adapter, I was starting to get really pissed off. So I googled for a while, and came up with this thread, which apparently talks about the new driver that does not work.
So a guy offers some advice. Simple advice at that: download the latest XP driver, and install it using Windows Vista compatibility mode. The driver and the Intel Graphics Media Accelerator configuration software (or whatever it is) work fine! Resolution and all. Apparently this works for all kinds of 8xx and 9xx chipsets, so try it out.
For reference, I'm using the public Release Candidate, latest updates, build 7100.
Sweeeet.
Site update: Header picture updated
So I updated the generic-looking header picture that came with the theme to something more suitable to the theme of the blog, i.e. technology. The picture is a cropped version of a public domain picture from the Los Alamos National Laboratory, and depicts Roadrunner, the fastest computer on earth. Those are IBM System x3755s in... well, a lot of racks. More on this awesome piece of machinery here. Also check out this cute collection of supercomputers from the present, as well as times past.
Linksys mods
So, me and P were installing OpenWRT on my Linksys WRT54GL, which went fine with both the 2.4 and the 2.6 version, despite the hilarious error message during the initial flash (see image below). The second part was adding an SD card reader to the Linksys, which proved to be more challenging.
The card reader was a fragile little piece of plastic which broke when we soldered the wires onto it (luckily it's not too expensive). I think the solder points on the Linksys were okay, despite my shitty soldering iron, which was kind of too big. Like going after a fly with a bazooka. I also didn't have any of the apparently necessary paraphernalia, which P ownd me about numerous times. The problem was the damn card reader. The solder points are just too fucking small! If you fudge it, you break the reader, and probably end up soldering the leads together, which makes a short, which blows your plan. Which is probably what happened.
Using an old SD card, I used some tape to get the leads onto the card directly, which didn't work either, despite the white LED lighting up on the Linksys PCB. This is supposed to be good, since it's supposed to light up when the card is in.
Using the 2.6 kernel version of OpenWRT I got nothing, bupkis. Loading the modules is fine, but as soon as you pass the GPIO arguments to the running kernel, it shits itself. Nothing in dmesg | tail either, like it doesn't see the card at all.
Using the 2.4 version, and the older mmc.o kernel module, I was able to get some dmesg output, but something is wrong. The Linksys sends out a signal to the card, but the card returns hex ff, which is apparently either an error code or a sign of the card not even being present or detected. Something is borked, so I'll have to look at this at a better time.
Clockwork Saturday
Today it is time to test the limits of my ensemble. I've run some tests with the Phenom II X4 940, which is clocked at 15x200, the default 3.0 GHz.
I raised it carefully to 16x200 = 3.2 GHz, and it hardly registered, as a rise in temperature, that is.
I've been running it at 17x200 now, 3.4 GHz, and it looks very stable. Check out the screenshot below. I've raised Vcore by a measly 0.25v, so there's still lots to do here. I can probably do it at 1.4 or even 1.5v, and then see how far I can raise the multiplier while it stays stable. I'm not looking for an extreme overclock that only survives the 15 seconds it takes to run some synthetic benchmark, just a stable, not-hot-as-a-volcano kind of thing.
After 2 hours of Prime95 testing to really burn that fucker, the highest I got was 67 degrees, using just the stock cooler and an open window in my room (no, it wasn't placed directly in front or anything, just like I'd normally have the window open).
It's time to raise the stakes.
Okay, volts up to 1.4, easy up on the NB and HT frequencies, and multiplier to 17.5, i.e. 3.5 GHz. It POSTs nicely and boots Windows nicely, but rebooted after running Prime for a while. Temps were okay, so it wasn't that; it probably needs more voltage. More tests later, it's time to go see Terminator: Salvation, with Anteuz.
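The clock speeds above are just multiplier times the 200 MHz reference clock. A small sketch of that arithmetic, with core_clock_ghz as a made-up helper name:

```shell
#!/bin/sh
# Core clock in GHz from the CPU multiplier and the reference clock in MHz.
# core_clock_ghz is a hypothetical helper name for this sketch.
core_clock_ghz() {
    awk -v mult="$1" -v ref="$2" 'BEGIN { printf "%.1f\n", mult * ref / 1000 }'
}

# The four settings tried in this post, all on the 200 MHz reference clock.
for mult in 15 16 17 17.5; do
    echo "${mult}x200 = $(core_clock_ghz "$mult" 200) GHz"
done
```

Each half step on the multiplier is worth 100 MHz, so there isn't much granularity to play with unless the reference clock gets raised too.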
General stuff, and the Matkakortti
A new machine was added today, a Sun Netra X1. It's basically like a weak version of the Netra T1 that I got earlier. I'm not sure what I'll do with it, but those Sun machines are pretty cool looking, so I couldn't pass it by.
The specs are basically a 500 MHz Ultrasparc IIi, 512 MB RAM, and two IDE disks. No floppy or CD, and two NIC ports plus a serial interface and two USB ports. It could run something like Sun Solaris 8, 9 or 10, or it could run, say, the Debian SPARC port. It could take on a light network task perhaps.
In other news, I'm thinking of ditching Windows 7, because it sucks. I'm serious. The transfer speeds, with any drivers that are available, are appalling. I was moving a file and it was doing it at around 2.6 MB/s. Booting to Ubuntu, I got speeds between 25 and 40 MB/s. How can this be? And in Ubuntu, I don't even have to install drivers, or think about write caching, or anything else. It just works. So I can't understand how this shit can be that difficult. I have a modern motherboard, with a modern chipset. The disks are capable of more.
I'm probably replacing the P4 rig inside Agrippa with the Athlon 64 3700+, simply because I think there's something wrong with the IDE controller on that P4 board. The two drives on one of the IDE buses keep disappearing randomly, which makes booting anything from them very challenging.
I'm working on making a server for the intranet, as Agamemnon took a place in the DMZ. The inside server would take care of DHCP allocation, and DNS. There would also be a pf machine (possibly one of the Sun machines?) that would handle traffic coming in and going out from my internal network.
I'm starting in earnest to investigate the Matkakortti system that we use here in Finland. It's equivalent to the US and Chicago Metrocard system, except that system is primitive, and based on a magstripe and reader, whereas Matkakortti uses an RFID chip to send and receive data.
What I'll start doing now is the following: I'll collect the numbers of cards and compare them to see if there's a difference between the two main card types. The types are the personal card, which is bound to (and contains) the information of the cardholder, and the non-user-specific card, which is more expensive, but can be transferred between people in a family, for instance. The card numbers should contain some information, as it's a very long string; a lot longer than the number of cards in circulation would require.
The card is only used in the capital region. There has been talk of making it country-wide, but financial hurdles have so far prevented them from deploying it everywhere. Figures...
Another thing I want to investigate is getting a device that can tell me if a frequency is transmitting or not. Then I could see how long the burst of data is between the reader and the card when you show it to the reader. The next part would be to get a reader, and look at the actual data, i.e. send out 13.xx MHz to the card, and watch what it sends back. It's probably encrypted, but it can't be too encrypted, since we are dealing with a very simple, quick system.
Also, I'd like to find out how the buses communicate with some central entity in order to keep track of what's on your card. A personal card can be recovered at certain service desks, and they have the exact up-to-date information on what is on your card. For a fee of 5 euro, to recoup the cost of the card, they'll give you a clone of your lost/missing/stolen card, and deactivate the old card. This tells me they can do a system-wide lock of a certain card number, as well as know the specifics of your card.
The readers themselves have a buffer, because I've encountered one beeping constantly and displaying a "Buffer full" message on the screen. The device was locked out and could not be used. Supposedly, the beeping only stopped once the thing was turned off, and then needed to be emptied/reset by a technician. I've only seen it once, which leads me to believe that there is a set buffer for a device, and that it perhaps uploads once or twice a day, depending on the line. But how does that work then? It wouldn't be completely up to date in that case.
The other alternative is that it does send data constantly through some wireless link (the trains are bound to have a link for control purposes, some RF thing), and that the reader had just faulted somehow and not handled the buffer as usual, filling it up with people's swipes.
It's an interesting system. As an example, here are the three numbers displayed on the backside of my card:
In the top left edge: 042405535
In the middle: BUSCOM 0523
In the top right corner: F2463001111154998100
If you have a card and want to help me out, send me the info from your card to grelbar ( äet ) grelbar (dot) net.
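Once a few numbers are collected, a first pass at comparing them could be as simple as checking how long a prefix two cards share. A sketch of that, where common_prefix is a made-up helper and the second number in the example is invented purely for illustration (only the first one is from my card):

```shell
#!/bin/sh
# Print the longest common prefix of two card numbers.
# common_prefix is a hypothetical helper name for this sketch.
common_prefix() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = (length(a) < length(b)) ? length(a) : length(b)
        i = 1
        # Walk both strings until the first position where they differ.
        while (i <= n && substr(a, i, 1) == substr(b, i, 1)) i++
        print substr(a, 1, i - 1)
    }'
}

# First argument: the number from the top right corner of my card.
# Second argument: an INVENTED number, not real data.
common_prefix F2463001111154998100 F2463001111199999999
```

A long shared prefix across cards of the same type, and a different one across types, would hint that part of the string is a type or issuer code rather than a serial number.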
Utilization
When you have a powerful processor such as the AMD Phenom, you really want to use the full fucking force of that thing. It's kind of like keeping a Ferrari in the garage the whole year if you don't.
So I figured, how much difference would it make if you benchmarked one core (out of four) versus the full four cores? I ran some tests using John the Ripper, which should be fairly good at loading the processor, as it's mostly just grunt-work. I added on the MPI patch, which allows you to use the mpich2 framework to run John on multiple processors/threads and even on a cluster of machines over the network.
The result on one core was, I think, 4400 raw MD5 hashes per second (correct me if I'm wrong here), whereas on all four cores, using 8 threads, the result was an impressive 27400 hashes per second. I have no idea how it technically works, but I can say from the ./john --test benchmark mode that it was indeed faster.
Comparing to an older machine, Agamemnon, which has two 3.0 GHz Xeons (the first 64-bit ones, I think), the result on both cores, 4 threads, was ~11 000 hashes per second.
It was nice seeing all four cores at 100% load for the duration of the test. Normally, just one is used, and the others do "something", between 0-20% in load, while one core is used more fully.
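To put the numbers above side by side, here is the arithmetic as a small shell sketch (ratio is a made-up helper name; the inputs are the figures quoted above, including the single-core one I wasn't sure about):

```shell
#!/bin/sh
# Ratio of two throughput figures, to two decimals.
# ratio is a hypothetical helper name for this sketch.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

echo "4 cores vs 1 core:     $(ratio 27400 4400)x"
echo "Per-core efficiency:   $(ratio 27400 $((4 * 4400)))"
echo "Phenom vs dual Xeon:   $(ratio 27400 11000)x"
```

A speedup of more than 4x on four cores is better than perfect scaling, which usually means the baseline is off. That fits with me not being sure about the 4400 figure, so treat the first two lines with suspicion.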
To run John the Ripper like this, I did the following (I'll document this here, because MPI's site didn't have all that good documentation):
- Use your favorite package manager to download at least OpenSSL and the mpich libraries (do a search, and get the ones listed as -dev), or download and compile if you do it that way
- Download and compile John the Ripper, with all necessary patches (such as the MPI and Jumbo patches). Be sure to match the machine type as closely as possible when you issue make, e.g. make clean linux-x86-64 for a 64-bit version. Issuing the make command alone will give you a list of the supported architectures.
- Download and compile the mpich2 set. Download any dependencies, should you need them.
- After this, create in your home directory the file .mpd.conf, and chmod it to 600.
- Start mpd using mpd &
- Go to the run directory under the John main directory, and issue for instance mpiexec -n 8 ./john --test. This will run the benchmark mode of John the Ripper, using the mpiexec platform, and running 8 processes. Depending on your processor, you may want to change this number.
- ...
- PROFIT!
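The .mpd.conf and launch steps above can be sketched roughly like this. Note the assumptions: write_mpd_conf is a made-up helper, the john/run path is hypothetical, and the MPD_SECRETWORD line is not something I covered in the steps. As far as I know, mpd refuses to start unless the file contains such a line, so the sketch writes one.

```shell
#!/bin/sh
# Create .mpd.conf in the given directory, readable by the owner only.
# write_mpd_conf is a hypothetical helper name; the MPD_SECRETWORD line
# is an assumption about what mpd expects to find in the file.
write_mpd_conf() {
    conf_dir=$1
    secret=$2
    (
        # Create the file with tight permissions from the start.
        umask 077
        printf 'MPD_SECRETWORD=%s\n' "$secret" > "$conf_dir/.mpd.conf"
    )
    # Also covers the case where the file already existed with looser bits.
    chmod 600 "$conf_dir/.mpd.conf"
}

# Typical session (commented out: needs mpich2 and a patched, built John):
# write_mpd_conf "$HOME" "pick-your-own-secret"
# mpd &
# cd john/run                  # hypothetical path to John's run directory
# mpiexec -n 8 ./john --test
```

Pick the -n value to suit your processor; 8 is just what I used on the quad-core.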



