Tuesday, September 16, 2008

LabView

The past day and a half have been spent dealing with more failure.  This time, the problem was that the FPGA on the cRIO needed an updated image.  The process for that should have been:

  1. Download new image to PC.
  2. Extract image from zip file.
  3. Open the Development Environment.
  4. Click "Download Image"

Instead, it was more like this:

  1. Download new image to PC.
  2. Look for download option in IDE.
  3. Failing that, find the directions online.
  4. The directions make it sound like LabView is required.
  5. Download LabView (570 MB).
  6. Install LabView.
  7. Run LabView and try to follow directions.
  8. There's no "Real-Time Project" option.  Google that; find out LabView Real-Time is an add-on.  Download that (370 MB).
  9. Install LabView RT and run LabView again.
  10. "Real-Time Project" is an option now.  Go through the instructions until reaching "Connect to Device."
  11. Try to connect to cRIO.  There's no option for the cRIO in the list of devices.  Try anyway.
  12. Connecting fails, so spend an hour looking for the software for the cRIO.
  13. I think I need the FPGA Module.  There's no trial for that, and the full version is $3k, not counting the $3k I would have to spend on LabView.
  14. Abandon the LabView FPGA Module idea.
  15. Find the cRIO image tool, and resign myself to the fact that I'll have to re-image the whole device.
  16. Run the tool.  It can't find the LabView runtime environment.
  17. LabView 8.6 is not compatible with LabView 8.5, apparently.  Download the LabView 8.5 runtime (98 MB).
  18. Install LabView 8.5 runtime.
  19. Run imaging tool again.  Now it's missing NiRioSrv.dll.  Search Google.
  20. Find out I probably needed the NI-RIO component to begin with, so download that (870 MB).
  21. Install NI-RIO.
  22. Find out that I actually need the LabView Real-Time module, version 8.5.1, not 8.6, which isn't available anywhere.

That took all day, and we made no progress.  Hopefully we can get this fixed.  It's currently preventing all progress.

Thursday, September 11, 2008

Pointers = Fail

Yesterday, Squawk stopped working on our cRIO. Today, we fixed it. The problem? This block:
char** fn;
symFindByName(sysSymTbl, (char*)symbol, fn, &ptype);
Notice a problem? No, it's not that sysSymTbl, symbol, or ptype are undefined (they are defined; I just left the declarations out). See the char** fn? Yeah, a pointer to a pointer to a char (it's being used as a pointer to a void*, but the symFindByName prototype is stupid). What's the problem with that? I'm allocating fn on the stack, but nowhere do I allocate the memory that fn will point to, so fn holds whatever garbage happened to be on the stack. What I meant to do was this:
char* fn;
symFindByName(sysSymTbl, (char*)symbol, &fn, &ptype);
What did that fix? Now I'm allocating fn, a plain char*, on the stack and giving the address of that stack memory to symFindByName. So when symFindByName writes the address of the requested symbol through the pointer it was given, the value lands in fn and not in random memory.
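Here's a minimal sketch of the same out-parameter pattern that compiles off the cRIO. The find_symbol function and demo_symbol are hypothetical stand-ins I made up for illustration; they are not the VxWorks symLib API, they only copy its "write the result through a pointer" shape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol that the stand-in lookup knows about. */
static char demo_symbol[] = "payload";

/* Hypothetical stand-in for a symbol-table lookup. Like symFindByName, it
 * reports its result through an out-parameter: it stores the symbol's
 * address in *out_value and returns 0, or returns -1 for an unknown name. */
static int find_symbol(const char *name, char **out_value)
{
    assert(out_value != NULL);      /* the caller must supply real storage */
    if (strcmp(name, "demo_symbol") != 0) {
        return -1;
    }
    *out_value = demo_symbol;       /* writes into the caller's variable */
    return 0;
}

/* Correct call pattern: fn is a real char* on the caller's stack, and &fn
 * tells the callee where to store the result. */
static char *lookup_correct(const char *name)
{
    char *fn = NULL;
    if (find_symbol(name, &fn) != 0) {
        return NULL;
    }
    return fn;
}

/* The buggy pattern, kept as a comment because it is undefined behaviour:
 *
 *     char **fn;                 // uninitialized, points nowhere useful
 *     find_symbol(name, fn);     // callee writes through a garbage pointer
 */
```

Calling lookup_correct("demo_symbol") hands back the address of demo_symbol, and an unknown name gives NULL. Undefined behaviour is also a plausible reason the broken version appeared to work for a while: the garbage in fn may have pointed at writable memory that nothing else cared about.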

The worst part of all this is that it was working before. I don't know why, and I'm not sure I want to know. I spent the better part of two days looking for this bug. I hate pointers.

Monday, September 8, 2008

It Lives!

Today, we managed to get Squawk to run on the cRIO.  This was after much poking at silly things like the working directory issue mentioned previously.

Some of the errors encountered along the way:

  • We set the page size wrong.  It should have been 16K.
  • usleep() wasn't implemented anywhere in VxWorks.  It shouldn't be in header files if it's not implemented.  It was added to os.c.
  • mprotect() wasn't implemented either.  It was noted somewhere that it only worked on mmap-ed memory, but there was still no provided implementation.  The workaround was to not protect the memory we're using.
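For the usleep() item, one way to fill the gap is to build it on POSIX nanosleep(). This is only a sketch under that assumption, not necessarily what went into os.c; the name sq_usleep is made up here so it doesn't clash with a real usleep on systems that have one.

```c
#define _POSIX_C_SOURCE 199309L   /* ask the C library to expose nanosleep() */

#include <assert.h>
#include <time.h>

/* Hypothetical replacement for a missing usleep(): sleep for `usec`
 * microseconds. Returns 0 on success and -1 on error, which includes
 * being interrupted by a signal. */
int sq_usleep(unsigned long usec)
{
    struct timespec req;

    req.tv_sec  = (time_t)(usec / 1000000UL);        /* whole seconds */
    req.tv_nsec = (long)(usec % 1000000UL) * 1000L;  /* remainder in ns */
    assert(req.tv_nsec < 1000000000L);  /* nanosleep rejects larger values */

    return nanosleep(&req, NULL);
}
```

Splitting the value into seconds and nanoseconds matters because nanosleep() fails if tv_nsec is a full second or more.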

The latest error is that the object memory format converter has stopped working.  There isn't any observable reason for this (none of its files changed).  This one is still broken, but I found the option to generate the suite in big endian to begin with.

This is what com.sun.squawk.Test looks like on the cRIO:

Running: com.sun.squawk.Test
x39 count = 1
*** Extending stack *** (stack size=336, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=672, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=2688, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=5376, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=10752, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=21504, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=43008, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=86016, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=172032, remaining stack=5, bcount=-1)
*** Extending stack *** (stack size=344064, remaining stack=5, bcount=-1)
x40 recursion level = 114664
x42 printStackTrace - this should be a java.lang.NullPointerException
java.lang.NullPointerException
    at java.lang.Throwable.<init>(Throwable.java:88)
    at java.lang.Exception.<init>(Exception.java:44)
    at java.lang.NullPointerException.<init>(NullPointerException.java:54)
    at com.sun.squawk.VM.nullPointerException(VM.java:394)
    at com.sun.squawk.Test$FOO.toString(Test.java:420)
    at com.sun.squawk.Test$FOO$1.run(Test.java:412)
    at com.sun.squawk.Test$FOO.foo(Test.java:410)
    at com.sun.squawk.Test.x42(Test.java:386)
    at com.sun.squawk.Test.runXTests(Test.java:97)
    at com.sun.squawk.Test.main(Test.java:48)
    at com.sun.squawk.Klass.main(Klass.java:3001)
    at com.sun.squawk.Isolate.run(Isolate.java:1565)
    at java.lang.Thread.run(Thread.java:231)
    at com.sun.squawk.VMThread.callRun(VMThread.java:1495)
    at com.sun.squawk.VM.callRun(VM.java:308)
*** Extending stack *** (stack size=78, remaining stack=6, bcount=-1)
[SNIP]
*** Extending stack *** (stack size=82, remaining stack=5, bcount=-1)
Date: Mon Sep 08 12:07:44 PST 2008
Empty loop timings per 16000000 = 64358ms
Empty loop timings(2) per 16000000 = 27864ms
Empty simple call per 1000000 = 3152ms
Empty long call per 1000000 = 3835ms
Empty exception calls per 1000000 = 4423ms
random time test (14886936 empty loop iterations)... 37283ms
Finished tests
--------------------------------------------------------------------
Hits   -   Class:99.99%  Monitor:97.78%  Exit:100.00%  New:89.60%
GCs: 1
** VM stopped: exit code = 12345 **

Friday, September 5, 2008

The Battle of Squawk

Last week was mostly setup, so nothing exciting happened. Lots happened this week, so I'll report on that.

We picked up this week knowing the following:
  1. Squawk vaguely supports POSIX-compliant systems.
  2. VxWorks uses the GCC toolchain to compile programs.
  3. VxWorks is (supposedly) POSIX-compliant.
So, our goal was to get Squawk to run on VxWorks. That's not as easy as it sounds.

Step one was to tell the Squawk builder about VxWorks. It needed to know things like the endianness of the platform and where the toolchain lives. This would have been completely trivial, except that VxWorks has to be difficult. Instead of naming the gcc compiler gcc, it's ccppc. I'm pretty sure that calls some other tools, but for our purposes, it just needed to compile code. So instead of just extending the GCC Compiler class, we needed to override getToolsDir, compile, and link.

Okay, so the build system knows about the VxWorks compiler. All set, right? Not at all. The build system also needed to know about the VxWorks platform (things like what the executable extension was) and had to be told how to create a VxWorks compiler.

After much tinkering, mostly with tool paths, the builder ran. So, what's the first thing I do? I try running it. That was a huge failure. The first file had more than 100 errors all having to do with an invalid '*'. Turns out that problem was caused by the builder using the JDK's libraries on the include path. This would be harmless if we weren't trying to cross-compile. So, back to the build system to remove those include paths.

That fixed the '*' error, but then the type 'jlong' wasn't defined. Of course, it was defined in the same file that was causing the '*' errors. The solution to that one? Get rid of everything but the typedefs, and copy the offending header file to the platform-specific include folder (psif when I need to talk about it again).

Good, that fixed a lot of the errors... Except:
"Cannot find malloc.h". Seriously? Why not? VxWorks can't be helpful at all. Everything defined in malloc.h? Yeah, that's in memLib.h. So, wrapper file for that, and into the psif it goes.
"Cannot find sys/time.h". Everything has a sys/time.h, so I look at the include folder from VxWorks. There's a time.h (not in a sys folder), so that must be it. Drop that in, and... fail. There's no struct timespec defined. Why is this? Turns out there's time.h and sys/times.h. I needed the second one. Luckily, this was only included in one of the platform-specific files, so a quick swap of the include saves the day.
"Cannot find netdb.h". This one was stupid. The netdb.h file isn't in the default include paths. It's stored in some other place, because apparently no one on VxWorks needs networking by default. But to add an include path... I have to change the builder.

That solved the missing files. What's left?
dlsym - This function gets used by Squawk when it needs to call a C function from Java. VxWorks happily defines the prototype, and doesn't implement it. It also has its own symbol table that it uses. Add a wrapper function for that, and cross another error off the list.
open - VxWorks has an open function, but it takes 3 parameters instead of 2. Trying to add a wrapper resulted in complaints about it already being defined. My solution? Make it a macro. Now I have an open macro that calls the open function.
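The open trick works because the C preprocessor never re-expands a macro's own name inside its replacement text. Here is a rough sketch of the idea; the SQ_DEFAULT_MODE value and the open_read_only helper are assumptions made up for the example, since the post doesn't say what the third argument was.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Assumed permission bits for the third argument; the value the actual
 * port used is not stated. */
#define SQ_DEFAULT_MODE 0644

/* Function-like macro that lets two-argument call sites reach a
 * three-argument open(). Inside the replacement text, `open` is not
 * expanded again, so it refers to the real function. */
#define open(path, flags) open((path), (flags), SQ_DEFAULT_MODE)

/* Example call site written in the two-argument style. */
static int open_read_only(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY);
}
```

The catch is that every call after the #define has to use exactly two arguments, so the macro belongs after the system headers and only in files that never pass a mode themselves.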

Finally, we have a build that works. Except the linking fails. Why does the linker fail? It doesn't need to do anything special, and it wasn't missing libraries. Well, I had accidentally left in a couple of the normal GCC linker options, and ccppc didn't like them. I think it was the -ldl option that broke it. Removed those, and... success! I have squawk.out and squawk.suite files in my working directory.

How do I get them onto the test system?

FTP makes the most sense. Let's try that. They can go in the system folder with all of the other .out files. Easy enough. Now, how do I run them? sp "squawk" failed epically, so I resorted to using the WindRiver Workbench to load up the file and execute it.

Did that work? No, of course not. We have this incredibly unhelpful exception detail (like pointers to memory unhelpful), and attaching a debugger gives me assembly code. After some Googling, I find the problem. VxWorks doesn't support argv/argc correctly. VxWorks has this clever solution where if your program needs three parameters, you have to write main(char* param1, char* param2, char* param3). Why can't these things ever be simple? Now Squawk has its own entry point for VxWorks (it currently only takes one argument) that maps the unhelpful variables into argv/argc.
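The entry point shim looks roughly like the sketch below. The names squawk_entry and squawk_main are hypothetical, and squawk_main here is a stand-in that just returns argc so the adapter can be tried anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the VM's conventional entry point (hypothetical name).
 * It only counts its arguments so the adapter can be exercised. */
static int squawk_main(int argc, char *argv[])
{
    assert(argv[argc] == NULL);   /* argv is expected to be NULL-terminated */
    return argc;
}

/* VxWorks-style entry point: parameters arrive as individual arguments,
 * so repackage them into argc/argv. Takes one argument, as in the post. */
int squawk_entry(char *arg1)
{
    char program_name[] = "squawk";
    char *argv[3];
    int argc = 0;

    argv[argc++] = program_name;
    if (arg1 != NULL) {           /* assumption: an omitted argument is 0 */
        argv[argc++] = arg1;
    }
    argv[argc] = NULL;

    return squawk_main(argc, argv);
}
```

Supporting more arguments means adding more char* parameters and growing the argv array to match.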

So we're finally up and running... Sort of. squawk -X will happily give you usage instructions, but squawk com.sun.squawk.Test fails. Why? First, because it couldn't find the squawk.suite file. This was one of the more stupid things I've seen. VxWorks looks in the working directory for squawk.suite, but the working directory is set when you connect to the COM port and cd somewhere. There must be a way to set it from code, so I'll have to hunt for that. Once it could find squawk.suite, it immediately complained about the endianness of the suite file being wrong (it was generated as a little-endian file, but needs to be big-endian like the C part). We haven't gotten to that yet, so hopefully on Monday we'll have Squawk running Java 'Hello World' programs.