x86asmfaq_general
What is x86 Assembly?

x86 assembly is the assembly language for the Intel 80x86 processor family and compatibles (including the Pentium and later). An assembly language is essentially a set of mnemonic names for machine code, which makes it effectively the lowest-level language there is. Somewhat counter-intuitively, that also makes assembly the most powerful language in existence: if it can't be written in assembly, it can't be written. Nevertheless, should you use assembly?

Should I use assembly?

If you are asking this question, then the answer is almost certainly an emphatic NO! There are a few situations where assembly is required, but by and large most application programmers will never need it, and in the vast majority of situations using it would be a poor idea. Now, let's rephrase the question: should I learn assembly? That is a different question, and the answer is yes. Learning any powerful general-purpose assembly language will give you a good understanding of how computers and compilers work, and it WILL result in your writing better code and understanding what your code does. x86 assembly isn't required for this purpose, though it certainly is a powerful general-purpose assembly language. However, you are most likely using an x86 processor, so x86 is the most relevant to you. If you aren't, then perhaps you should try the assembly language of your processor. If you aren't and still want to learn x86, see the resources.

Where can I learn x86 Assembly?

As always, a college course is probably the best answer. However, let's stick to things you can find on the Internet or in books.

The Art of Assembly (AoA) is probably the best and most popular resource freely available over the Internet. Located at http://webster.cs.ucr.edu/, it is a college-level textbook that starts with the basics and covers everything you'll need to know, plus some very interesting advanced topics. As of this writing there are three editions: one for DOS, one for Windows, and one for Linux. My recommendation is to read the DOS edition first, then the Windows or Linux edition (they're almost identical), even if you never intend to write DOS code. The DOS edition contains some information the others don't, including low-level details that aren't applicable in a protected-mode application but may still be useful to know for drivers.

Where can I get an assembler?

An assembler is to assembly as a compiler is to a high-level language. The difference in naming is somewhat arbitrary, and every now and then you'll hear an assembler referred to as a compiler.
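To make concrete what an assembler does, here is a minimal sketch in C that hand-assembles one instruction: a 32-bit register-to-register mov. The helper name encode_mov_r32_r32 and the reg32 enum are made up for this illustration and are not part of any real assembler. The sketch assumes 32-bit code and uses the 0x89 opcode form; a real assembler is free to pick the equivalent 0x8B form instead.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbers as they appear in x86 machine code (32-bit registers). */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

/*
 * Hypothetical helper: hand-assemble "mov dst, src" (Intel operand order)
 * for two 32-bit registers, assuming 32-bit code. Writes 2 bytes to out.
 */
void encode_mov_r32_r32(enum reg32 dst, enum reg32 src, uint8_t out[2])
{
    assert((unsigned)dst < 8 && (unsigned)src < 8);

    /* Opcode 0x89 is "MOV r/m32, r32": the source register goes in the
     * ModRM reg field and the destination in the ModRM r/m field. */
    out[0] = 0x89;

    /* ModRM byte: mod = 11 (register direct), reg = src, r/m = dst. */
    out[1] = (uint8_t)(0xC0 | ((unsigned)src << 3) | (unsigned)dst);
}
```

Calling encode_mov_r32_r32(EBX, EAX, bytes) fills bytes with 0x89 0xC3, which is one valid encoding of mov ebx, eax. The mnemonic really is just a readable name for those two bytes.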

The most popular assemblers now are:
  • NASM (The Netwide Assembler)
  • MASM (Microsoft Macro Assembler)
  • TASM (Turbo Assembler)
  • GAS (The GNU Assembler)
  • FASM (The flat assembler)
  • and others
NASM is free, open-source, and available for most platforms.
MASM is free and available for 32-bit Windows. Older versions target DOS.
TASM is not free (though I believe it comes with the trial download of CBuilder) and it targets 32-bit Windows. Older versions target DOS.
GAS is free, open-source, and available for most platforms.
FASM is free, open-source, and available for Linux and Windows. It is still under development, but it is stable, fast, and efficient. About its syntax, Tomasz Grysztar (the author of FASM) says on the Win32ASM Community message board: "The entire FASM syntax is based on the TASM's IDEAL mode (with some influence of NASM)".

If you are just starting, my recommendation is NASM; however, most examples you'll see on the web are written for MASM or for TASM in MASM-compatibility mode. Ideally, the code listings in this FAQ will be relatively independent of assembler, or will contain different listings for each assembler, though that may take time.

What are AT&T and Intel syntax?

AT&T syntax is friendlier to compilers but, most people think, less friendly to humans. In AT&T syntax the operand size is encoded in the mnemonic (e.g. movl for "move long"), registers are decorated (or perhaps mangled is a better word) with % signs, and other manglings occur (for example, immediate values are prefixed with $). The result is assembly that is completely unambiguous and nearly impossible to subtly change, but that is a nightmare to read and write and is full of redundant information.

Intel syntax is much friendlier. Most of the time the assembler can deduce the correct opcode from the operands, and in the very few instances where it can't, it knows and will issue an error. Intel syntax is also a lot less "busy" than AT&T syntax. However, one issue people have with Intel syntax is that it seems to be backwards: the destination operand comes first and the source second. Here's an example that illustrates the "problem" and compares AT&T syntax and Intel syntax:

AT&T:
    movl %eax, %ebx


Intel:
    mov ebx, eax


These two examples accomplish the same thing, which is (did you guess correctly?) copying the value in the eax register to the ebx register.

The examples in this FAQ should all use Intel syntax, and so should the code listings unless the targeted assembler uses AT&T syntax. Of the popular assemblers listed above, the only one that does is GAS, and it also supports Intel syntax.

What is 16-bit/real-mode and 32-bit/protected-mode?

Starting with the 80386, Intel processors supported 32-bit protected-mode. Protected-mode allows operating systems to implement full access to memory, multitasking, and security. Before that there was 16-bit real-mode. (There is also a 16-bit protected-mode, introduced with the 80286, but it is rarely used today.) 16-bit real-mode only allows access to 1 megabyte of memory, and it has a confusing and limiting segmentation system. It is also impossible to properly implement multitasking and security in real-mode. The only "popular" OS that runs in real-mode is DOS. For that reason, and because application-level protected-mode programming is simpler, most examples and code listings should assume protected-mode unless there is a reason not to. If they use real-mode, they should state so clearly.
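The real-mode segmentation system is easier to judge with the arithmetic in front of you. The sketch below, written in C rather than assembly so it can be compiled anywhere, computes the physical address a real-mode segment:offset pair refers to. The function names are hypothetical, and masking the result to 20 bits is an assumption that models the original 8086, which has only 20 address lines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: physical address for a real-mode segment:offset pair.
 * The CPU shifts the 16-bit segment left by 4 bits (multiplies it by 16)
 * and adds the 16-bit offset.
 */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;

    /* Assumption: wrap at 1 megabyte, as on an 8086 with 20 address lines. */
    return linear & 0xFFFFFu;
}

/* A few worked cases, written as assertions so they double as a self-check. */
void real_mode_examples(void)
{
    assert(real_mode_linear(0x1234, 0x0010) == 0x12350);
    /* A different segment:offset pair that names the very same byte. */
    assert(real_mode_linear(0x1235, 0x0000) == 0x12350);
    /* Past the top of the 1 megabyte range the address wraps to zero. */
    assert(real_mode_linear(0xFFFF, 0x0010) == 0x00000);
}
```

Two things stand out: many different segment:offset pairs alias the same byte, and no pair can reach beyond 1 megabyte under this model. Both are a large part of why the scheme is considered confusing and limiting.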

What is inline assembly?

Inline assembly is a feature of some higher-level languages that lets you insert assembly code directly into the higher-level language's source file. How you go about that is completely compiler- and language-specific and beyond the scope of this question; it would probably be more appropriately answered in the FAQ for that language or compiler. As far as the assembly is concerned, the instruction mnemonics are fairly fixed (there are some minor differences between AT&T and Intel syntax, even beyond the size suffix), and they obviously mean the same thing.
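Purely as an illustration of what inline assembly can look like, here is a minimal sketch using the extended asm syntax of GCC-compatible C compilers. Other compilers use different notations, so treat this as one example rather than the general form. The function names are made up for this sketch, and on non-x86 targets or other compilers it falls back to plain C so that it still builds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: copy a 32-bit value using a register-to-register mov. */
uint32_t copy_with_asm(uint32_t src)
{
    uint32_t dst;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* GCC-style extended asm. The template uses AT&T syntax by default:
     * source first, destination second. %0 is dst, %1 is src, and the
     * compiler picks the actual registers. */
    __asm__ ("movl %1, %0" : "=r"(dst) : "r"(src));
#else
    /* Fallback for other compilers or processors: plain C assignment. */
    dst = src;
#endif
    return dst;
}

/* Usage example, written as an assertion so it doubles as a self-check. */
void copy_with_asm_example(void)
{
    assert(copy_with_asm(42u) == 42u);
}
```

Note that the assembly line itself is the same movl from the AT&T example earlier; only the operand placeholders and the constraint lists are specific to the compiler.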

Last edited July 25, 2003 by scientica. Current revision: 4.

Programmers Heaven - for .NET, Java, C/C++ and WEB Developers!
© 1996-2008 Community Networks Ltd. All rights reserved. Reproduction in whole or in part, in any form or medium without express written permission is prohibited. Violators of this policy may be subject to legal action. Please read Terms Of Use and Privacy Statement for more information. Development by Tore Nestenius at .NET Consultant - Synchron Data.