This project is read-only.

Managed x86-64 assembler

Sep 15, 2011 at 10:16 PM

Hello,

I read somewhere that you want to replace the NASM asm-to-machine-code part in IL2CPU with something that assembles machine code directly. As it happens, I am writing an open-source managed (e.g. C#) assembler which will assemble x86-64 (or any other instruction set, if you implement it) to machine code. Now I would like to ask a somewhat lame 'preference' question, but I think this is the right place for it. Here it comes:

What operand order makes more sense (i.e. do you prefer) when creating a new instruction object, and why? (Source, Destination) or (Destination, Source)? (I.e. GAS order or NASM order?)

And while you're at it... do you know of a nice name for the project? Something which doesn't have special characters in it that are unsearchable with Google (e.g. not "#ASM") and is a bit unique and descriptive?

Thanks, and keep up the good work!

Sep 15, 2011 at 10:23 PM

Destination, Source is the way we currently emit it, and it would require a significant refactoring to emit it the other way around. As for a name, perhaps SASM (SharpAssembler), to keep with the numerous existing assemblers (all in native code though :() :P
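
To make the two operand orders concrete, here is a minimal sketch of an instruction object that takes (Destination, Source), plus a thin wrapper for the GAS-style (Source, Destination) order. It is written in Python for brevity and is purely hypothetical: the names Mov, REGS and mov_gas_order are assumptions, not the API of IL2CPU or SharpAssembler, and only register-to-register MOV between the eight legacy 64-bit registers is handled.

```python
from dataclasses import dataclass

# Register numbers of the eight legacy 64-bit registers.
# r8-r15 are left out because they would need extra REX bits.
REGS = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
        "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}


@dataclass(frozen=True)
class Mov:
    """Hypothetical instruction object using (destination, source) order."""
    destination: str
    source: str

    def encode(self) -> bytes:
        # MOV r/m64, r64 is encoded as REX.W + 89 /r.
        # ModRM: mod=11 (register direct), reg=source, rm=destination.
        modrm = 0xC0 | (REGS[self.source] << 3) | REGS[self.destination]
        return bytes([0x48, 0x89, modrm])


def mov_gas_order(source: str, destination: str) -> Mov:
    """Hypothetical helper accepting GAS-style (source, destination) order."""
    return Mov(destination, source)


if __name__ == "__main__":
    # NASM order: mov rax, rbx
    print(Mov("rax", "rbx").encode().hex())
    # GAS order: movq %rbx, %rax (same machine code)
    print(mov_gas_order("rbx", "rax").encode().hex())
```

Whichever order the constructor takes, the encoded bytes are the same; the choice only affects how call sites read.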

Sep 26, 2011 at 11:01 PM
Edited Sep 26, 2011 at 11:02 PM

Alright then. SharpAssembler 0.7 has been released on SourceForge.

Sep 26, 2011 at 11:17 PM
Edited Sep 26, 2011 at 11:20 PM

Yes, I can understand your reasoning behind releasing it under the GPL, and it currently doesn't cause any issues (as it would only be used at build time). However, that license could cause issues when we move to JITting code (and ultimately running Cosmos in itself), because the assembler would then have to be included within the OS, which is allowed to be distributed commercially. GPL doesn't allow for commercial distribution. A potential resolution could be to add an exception to the license, similar in language to the license I have on my folder, which allows use only within Cosmos and things created with the Cosmos Toolkit (i.e. OSes compiled with IL2CPU), including commercial products. (@devs, we probably need to check that my license properly covers use within Cosmos and OSes created with Cosmos. I'm pretty sure it covers that (and disallows use anywhere else), but more eyes on it are always a good thing.)

Also, you may wish to check your released file; it's missing an extension :P (Note to anyone downloading it: it's a .zip file and can be renamed and opened as such, or you could just use 7-Zip->Open as Archive :P)

Sep 26, 2011 at 11:52 PM

Thank you for your feedback. I re-uploaded the file, now with the correct extension. I'll have a look at the license, because I would like to achieve two goals: first, I would like to enable all kinds of open source projects (like Cosmos) to use and distribute my library; and second, I would like it to remain something unique to the open source community, to make that community stronger.

Sep 27, 2011 at 8:27 AM
orvid: I personally doubt such exceptions are good to make, but hey, I'm not a lawyer....

Anyway, virtlink: it would be best to first start a discussion about whether we would use the managed assembler at all.....


Read the full discussion online.

To add a post to this discussion, reply to this email (Cosmos@discussions.codeplex.com)

To start a new discussion for this project, email Cosmos@discussions.codeplex.com

You are receiving this email because you subscribed to this discussion on CodePlex. You can unsubscribe or change your settings on codePlex.com.

Please note: Images and attachments will be removed from emails. Any posts to this discussion will also be available online at codeplex.com


Sep 27, 2011 at 2:28 PM

Now that I really think about it, for it to be fully usable in Cosmos (due to the fact that derivatives of IL2CPU itself are allowed to be commercial), it would have to be under the BSD license....


@Devs, what do you think about caching the compiled output of the System libraries? Currently we re-compile them every time. What I was thinking was using the checksum of IL2CPU.x86 to determine whether or not to invalidate the cache, because IL2CPU.x86 changing would be the only reason you couldn't use the cached versions. We would only cache methods, and only the methods in the .NET Framework itself, as we know those don't change.
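
The checksum-based invalidation proposed above could look roughly like the following sketch. It is Python rather than C#, and everything in it (the MethodCache class, the JSON cache file, the compile_fn callback, the use of SHA-256) is an assumption made for illustration, not something IL2CPU provides. It also deliberately ignores the method id and type id problem raised elsewhere in this thread.

```python
import hashlib
import json
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class MethodCache:
    """Hypothetical on-disk cache of compiled methods.

    The whole cache is discarded whenever the checksum of the compiler
    binary (e.g. the IL2CPU.x86 assembly) differs from the stored one.
    """

    def __init__(self, cache_file: Path, compiler_binary: Path):
        self.cache_file = cache_file
        self.compiler_checksum = file_checksum(compiler_binary)
        self.entries = {}  # method name -> hex-encoded machine code
        if cache_file.exists():
            stored = json.loads(cache_file.read_text())
            # Reuse cached methods only if the compiler is unchanged.
            if stored.get("compiler_checksum") == self.compiler_checksum:
                self.entries = stored.get("methods", {})

    def get_or_compile(self, method_name: str, compile_fn) -> bytes:
        """Return cached code, calling compile_fn only on a cache miss."""
        if method_name not in self.entries:
            self.entries[method_name] = compile_fn(method_name).hex()
        return bytes.fromhex(self.entries[method_name])

    def save(self) -> None:
        payload = {"compiler_checksum": self.compiler_checksum,
                   "methods": self.entries}
        self.cache_file.write_text(json.dumps(payload))
```

Invalidating everything at once keeps the logic trivial; a finer-grained scheme would need to know which methods a given compiler change actually affects.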

Sep 27, 2011 at 3:28 PM
Caching requires large compiler changes, which is something we don't want at this moment; we're currently working on the debugger. After the next release (which will feature read/write FAT support), we're going to do some major work on the compiler.

Why is caching results that important? Currently on my notebook it takes about 8 seconds to do a build of a Cosmos kernel. With such fast compilation, output caching is probably worth neither the effort nor the overhead, as we would have to take several things into account, like the method and type IDs needed for vcalls, generic types (where to put them), and other things. Remember, we are not compiling mscorlib (or any assembly, for that matter) completely, but are scanning to find out which parts we need to compile.
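
The "scanning to find out which parts we need to compile" step can be pictured as a worklist walk over the call graph, starting at the kernel entry point. The sketch below is a generic reachability scan in Python; the function name scan_reachable and the dictionary-based call graph are assumptions for illustration and say nothing about how the actual IL2CPU scanner is implemented.

```python
from collections import deque


def scan_reachable(entry_point, calls):
    """Return the methods reachable from entry_point, in discovery order.

    calls maps a method name to the names of the methods it references.
    Methods that are never reached are simply never compiled.
    """
    queued = {entry_point}
    worklist = deque([entry_point])
    order = []
    while worklist:
        method = worklist.popleft()
        order.append(method)
        for callee in calls.get(method, ()):
            if callee not in queued:  # queue each method only once
                queued.add(callee)
                worklist.append(callee)
    return order


if __name__ == "__main__":
    # Hypothetical call graph; the method names are made up.
    graph = {
        "Kernel.Main": ["Console.WriteLine", "String.Concat"],
        "Console.WriteLine": ["String.Concat"],
        "Unused.Helper": ["String.Concat"],
    }
    print(scan_reachable("Kernel.Main", graph))
```

Because only reachable methods are visited, the set of compiled methods depends on the kernel being built, which is one reason a per-assembly cache does not map onto this approach directly.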



Sep 27, 2011 at 5:05 PM

Not all dev machines are equal: mine takes 35 seconds to compile the struct-test. (It's much faster post-IL2CPU now that I have my optimizer working.) When we get graphics fully working, the size of kernels will start to grow, and compile time will increase. Also, compare the time it takes to compile the kernel from C# to the time it takes for IL2CPU to execute. NGen'ing my TestBed and all its dependencies takes less than a second, and that's a lot more code than IL2CPU is processing in a much larger span of time.

As to doing the actual caching, wouldn't the only places that need to be modified (provided we switch to emitting binary) be where we generate the asm and where we emit it? I could potentially do this with the current emission method by using a pseudo-instruction which emits the cached version of a method instead of an opcode.
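
One way to picture such a pseudo-instruction, assuming the text-based NASM output stays in place, is an object that renders previously compiled bytes as a NASM db data line instead of a mnemonic. The class names below (Instruction, CachedMethodBytes) are hypothetical and written in Python for brevity; they are not IL2CPU types.

```python
class Instruction:
    """Hypothetical ordinary instruction that renders as NASM text."""

    def __init__(self, mnemonic, *operands):
        self.mnemonic = mnemonic
        self.operands = operands

    def to_asm(self) -> str:
        return f"    {self.mnemonic} {', '.join(self.operands)}".rstrip()


class CachedMethodBytes:
    """Hypothetical pseudo-instruction: emits cached machine code verbatim."""

    def __init__(self, label: str, machine_code: bytes):
        self.label = label
        self.machine_code = machine_code

    def to_asm(self) -> str:
        # NASM's db directive places raw bytes into the output.
        data = ", ".join(f"0x{b:02X}" for b in self.machine_code)
        return f"{self.label}:\n    db {data}"


if __name__ == "__main__":
    program = [
        CachedMethodBytes("cached_method", bytes([0x48, 0x89, 0xD8, 0xC3])),
        Instruction("call", "cached_method"),
    ]
    print("\n".join(item.to_asm() for item in program))
```

This only works if the cached bytes contain no addresses or ids that differ between builds, which is the objection raised in the replies.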

Sep 27, 2011 at 5:30 PM
No, there would be large changes necessary, starting with compiling full assemblies, finding a solution for doing generics this way, changing all code to use methodid and typeid placeholders, etc.


Sep 27, 2011 at 7:32 PM

I'm only talking about caching the compiled methods themselves (and what they depend on), not even entire types.

Sep 27, 2011 at 10:31 PM
> @Devs, what do you think about caching the compiled output of the System
> libraries, as currently we re-compile them every time. What I was

Already on the list, but after we do the next compiler refactor.

Sep 28, 2011 at 8:10 AM
orvid: your latest message tells me you're not fully aware of what types are, and how they're used.

A type is nothing more than a collection of methods and some metadata. That metadata is used by the compiler. For example, the number (and size) of all fields is needed so that the newobj instruction "knows" how much RAM to allocate for the object instance. Another piece of metadata is the method id, which is used in vcall resolution. Each callvirt instruction has this number embedded, and the InitVMT code also uses these numbers (and the type id) to build up an inheritance tree describing types and base types, and the methods each type implements. All of this would need to be changed. We have been thinking a bit about that, but it's quite complicated.

Another thing that's quite hard (or at least hard to do in a high-performance way) is to generate some data from an assembly using reflection, and later on read that data back in and get the same reflection elements back again.

Like I said, we're going to work on the compiler after the coming release, and we likely won't need such tricks.

One thing is for sure: we don't want to touch the current compiler more than necessary, as it's easier to break things than to improve them...
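
As a rough illustration of why the ids matter, the sketch below models a virtual method table keyed by type id and method id, with lookup walking up the base types until an implementation is found. It is a simplified Python model under assumed names (VirtualMethodTable, add_type, resolve); the real InitVMT code and the id assignment in IL2CPU are not shown in this thread and may differ.

```python
class VirtualMethodTable:
    """Hypothetical model of type-id / method-id based vcall resolution."""

    def __init__(self):
        self.base_type = {}  # type id -> base type id (None for the root)
        self.methods = {}    # type id -> {method id: code address}

    def add_type(self, type_id, base_type_id, implemented):
        """Register a type, its base type, and the methods it implements."""
        self.base_type[type_id] = base_type_id
        self.methods[type_id] = dict(implemented)

    def resolve(self, type_id, method_id):
        """Find the implementation a callvirt with this method id would hit."""
        current = type_id
        while current is not None:
            address = self.methods[current].get(method_id)
            if address is not None:
                return address
            current = self.base_type[current]  # fall back to the base type
        raise LookupError(f"method {method_id} not found for type {type_id}")


if __name__ == "__main__":
    vmt = VirtualMethodTable()
    # All ids and addresses here are made up for the example.
    vmt.add_type(0, None, {10: 0x1000})  # root type implements method 10
    vmt.add_type(1, 0, {10: 0x2000})     # derived type overrides method 10
    vmt.add_type(2, 0, {})               # derived type inherits method 10
    print(hex(vmt.resolve(1, 10)), hex(vmt.resolve(2, 10)))
```

If ids like these are handed out while scanning, the same method can receive a different id in a different kernel build, so machine code with an embedded id cannot simply be reused from a cache.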

Sep 28, 2011 at 1:54 PM
mterwoord wrote:

> Another thing that's quite hard (or at least hard to do in a high-performance way), is generate some data from an assembly using reflection, and later on read in that data, and get the same reflection elements back again.

There are things that have succeeded in doing exactly that. Mono.Cecil for example. (had to point that one out :P)


As to not fully understanding what types are, that's probably correct, and that's why I've not tried to implement caching. I might though. I would just have to strongly limit the kinds of methods that could be cached. Generic methods, for one, wouldn't be cached, as I'm not sure how we deal with them, so I can't say that it's possible to just read them back from disk and have them work. I would probably only enable it when building without debug info, so that only me and one other person (who will probably bug me to death if I mess something up :P) could be affected, as everyone else builds with debug info.

Sep 28, 2011 at 1:59 PM
Everybody else is using the debugger as well....

Anyway, regarding Mono's Cecil: we used it in the beginning, and it was slow as hell. Reflection is much faster, and it gave us generics support for free. Besides that, using Cecil means implementing plugs is much harder.

Most of the compiler's time is spent in the scanner, which would be replaced with full-assembly compilation, but then we're probably going to compile about 100 times as much as we do now, so go figure how long it would take..

orvid, are you on IRC by any chance?



Sep 28, 2011 at 2:18 PM

Unfortunately, no, I don't have access to IRC atm; give me about 6 hours and I'll be on. (Also, I said without debug info :P so those using the debugger wouldn't be affected :P)

Sep 28, 2011 at 2:36 PM
Why wouldn't you use the debugger?

Sep 28, 2011 at 2:44 PM

My dev machine can't handle VMware :P, or, for that matter, any other VM, so I run the OS on a different machine.

Sep 28, 2011 at 2:47 PM
How come it can't handle VMware?

A Cosmos VM should have enough RAM when assigned something like 32 MB.

Sep 28, 2011 at 2:50 PM
On 9/28/2011 9:44 AM, blah38621 wrote:
> My dev machine can't handle vmware :P, or, for that matter, any other
> VM, so I run the os on a different machine.

The debugger uses serial, so even on a diff PC just connect by serial.

Sep 28, 2011 at 2:50 PM
On 9/28/2011 9:44 AM, blah38621 wrote:
> My dev machine can't handle vmware :P, or, for that matter, any other
> VM, so I run the os on a different machine.

I've run VMware on PCs with < 1 GHz and only 256 MB of RAM...

Sep 28, 2011 at 3:00 PM

But with Visual Studio Pro and Opera running at the same time? (Opera takes about 100 MB of RAM, and if more than that is paged out for longer than a few minutes, it loses its connection to IRC.) My dev machine has a 1.2 GHz Celeron and 756 MB of RAM :P (not to mention the 40 GB hard drive :P). As to the serial, I test using QEMU running on my server (which runs Ubuntu Linux), so I'm not sure how well that would work :P

(As to how I move the ISOs between the computers, I have a Samba share set up which is connected to all of my computers.)