Python Answers Forum - The size of a data type


9 replies on 1 page. Most recent reply: Sep 1, 2004 1:11 AM by Greg Jorgensen


Alexandru Ruse

Posts: 6
Nickname: jfrosty
Registered: Jul, 2003

The size of a data type

Posted: Aug 27, 2004 12:17 PM

How do I get the size, in bytes, of an object in Python? Just the way sizeof() does in C. Well, in C it's compile time, in Python it will be run-time, but that doesn't matter.

Kondwani Mkandawire

Posts: 530
Nickname: spike
Registered: Aug, 2004

Re: The size of a data type

Posted: Aug 28, 2004 2:35 PM

I also found this of particular interest. I ran a search on Google for "sizeof Python" and it doesn't seem like it is possible. One of the results was:
http://www.egenix.com/mailman-archives/egenix-users/2002-November/000158.html
However, I used Python a long while back in a CS course and, if I remember correctly, there are ways to incorporate C libraries as extensions to Python, in which case you would be able to use lib.h or whatever library you require for your function. I don't remember the specifics; you could probably look that up.

Kondwani Mkandawire

Posts: 530
Nickname: spike
Registered: Aug, 2004

Re: The size of a data type

Posted: Aug 28, 2004 2:42 PM

Sorry again.

Apparently the library you want to load
is ctypes (search for how to do this).

http://starship.python.net/crew/theller/ctypes/tutorial.html
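
For reference, ctypes does expose a sizeof() for C-compatible types. The following is a minimal sketch, assuming a modern Python where ctypes ships in the standard library (in 2004 it was a separate package). The Machine class is a hypothetical illustration, not part of ctypes:

```python
import ctypes

# Sizes of individual C types, as seen by the C compiler for this platform.
ubyte_size = ctypes.sizeof(ctypes.c_ubyte)  # unsigned char is 1 byte
int_size = ctypes.sizeof(ctypes.c_int)      # platform-dependent, commonly 4


# Hypothetical C-style struct: 255 bytes of memory plus two one-byte registers.
class Machine(ctypes.Structure):
    _fields_ = [
        ("memory", ctypes.c_ubyte * 255),
        ("PC", ctypes.c_ubyte),
        ("IR", ctypes.c_ubyte),
    ]


# Every field is a single byte, so the compiler layout needs no padding.
machine_size = ctypes.sizeof(Machine)
print(ubyte_size, int_size, machine_size)
```

Note that this reports the size of the C layout that ctypes models, not the memory used by an arbitrary Python object.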

Good luck.

Spike

Alexandru Ruse

Posts: 6
Nickname: jfrosty
Registered: Jul, 2003

Re: The size of a data type

Posted: Aug 28, 2004 4:40 PM

Thanks, I'll look into it. I was just curious if it can be done, I don't really need it. I started wondering about it when porting some C code which had something like:


typedef unsigned char byte;
struct Machine
{
    byte memory[255];
    byte PC, IR;
};

It emulated a very simple computer, and the memory was represented as an array of 255 bytes. It relied on the fact that char is one byte in C. So I was thinking what kind of data type would be exactly one byte in Python. Anyways...
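
Later Python versions (2.6 and up) have a built-in bytearray type whose elements are exactly one byte each. A small sketch of how the emulator's memory might be held in one, assuming Python 3; the variable names are illustrative only:

```python
# bytearray stores exactly one byte (0..255) per element.
memory = bytearray(255)  # 255 zero-initialised bytes, like byte memory[255]

memory[0] = 200  # fine: the value fits in one byte

try:
    memory[1] = 256  # too large for a single byte
    overflow_rejected = False
except ValueError:
    overflow_rejected = True

print(len(memory), memory[0], overflow_rejected)
```

Unlike C's unsigned char, an out-of-range value is rejected with an exception rather than silently wrapping around.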

Matt Gerrans

Posts: 1153
Nickname: matt
Registered: Feb, 2002

Re: The size of a data type

Posted: Aug 29, 2004 1:10 PM

My guess is that if you port the code to Python, the size of the structure, or of elements of an array inside it will become irrelevant. Since you don't do pointer arithmetic in Python, as is common in C, the size of things is not as important. So, for example, in your case, you might have a list of 255 memory "locations," each of which could contain a character.

In other words, when you port a C program to Python, you also need to port the C idioms to Python idioms.

Greg Jorgensen

Posts: 65
Nickname: gregjor
Registered: Feb, 2004

Re: The size of a data type

Posted: Aug 29, 2004 9:25 PM

Python doesn't have statically-typed variables like C, so there's no analog to C's sizeof(). You can port the code to Python easily enough, you just can't expect to simply translate the syntax. Take your example:


typedef unsigned char byte;
struct Machine
{
    byte memory[255];
    byte PC, IR;
};

Python doesn't have typedef or struct, either. You can make a similar data structure in several ways. A simple class, for example:


class Machine:
    def __init__(self):  # methods must take self explicitly
        self.PC = 0
        self.IR = 0
        self.memory = [0] * 255  # same length as byte memory[255] in the C struct

You would then add methods to limit the values stored in PC and IR to 0..255, and to check that only unsigned char values (0..255) get stored in the memory array.
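
One way the range checks described above might look is sketched here, assuming Python 3. The helper _check_byte and the load/store method names are hypothetical choices for illustration, not something from the original C code:

```python
class Machine:
    """Hypothetical Python port of the C struct Machine with range checks."""

    MEMORY_SIZE = 255  # matches byte memory[255] in the C code

    def __init__(self):
        self._pc = 0
        self._ir = 0
        self._memory = [0] * self.MEMORY_SIZE

    @staticmethod
    def _check_byte(value):
        # Reject anything that would not fit in an unsigned char.
        if not isinstance(value, int) or not 0 <= value <= 255:
            raise ValueError("expected an integer in 0..255, got %r" % (value,))
        return value

    @property
    def PC(self):
        return self._pc

    @PC.setter
    def PC(self, value):
        self._pc = self._check_byte(value)

    @property
    def IR(self):
        return self._ir

    @IR.setter
    def IR(self, value):
        self._ir = self._check_byte(value)

    def _check_address(self, address):
        # Disallow negative indexes, which a plain list would accept.
        if not 0 <= address < self.MEMORY_SIZE:
            raise IndexError("address out of range: %r" % (address,))
        return address

    def load(self, address):
        return self._memory[self._check_address(address)]

    def store(self, address, value):
        self._memory[self._check_address(address)] = self._check_byte(value)


machine = Machine()
machine.PC = 10
machine.store(0, 255)
print(machine.PC, machine.load(0))
```

Raising an exception on bad values is a design choice; an emulator that needs C-style wraparound could store value & 0xFF instead.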

C's idioms are great when you're writing in C, but they don't translate to Python.

Greg Jorgensen
PDXperts LLC - Portland, Oregon USA

Alexandru Ruse

Posts: 6
Nickname: jfrosty
Registered: Jul, 2003

Re: The size of a data type

Posted: Aug 30, 2004 1:12 PM

I understand. It was more of a curiosity thing, anyway. In fact, I've been programming in Python for a while now and I never saw the need for "C tricks" up until now. However, I don't agree with the following:

> Python doesn't have statically-typed variables like C, so
> there's no analog to C's sizeof().

There is no reason we shouldn't be able to determine the size, in bytes, of an object in Python. In C, sizeof() is a compile-time expression (you can even use it in an array dimension declaration, so it's obviously constant). In a dynamic language like Python, it should be run-time, but I see no reason why it couldn't be done. After all, we have id(obj), which, according to the docs:

> Return the `identity' of an object. This is an integer (or
> long integer) which is guaranteed to be unique and
> constant for this object during its lifetime. Two objects
> whose lifetimes are disjunct may have the same id() value.
> (Implementation note: this is the address of the object.)

So, if we can get the starting address, it shouldn't be that hard to implement sizeof(). And I think it could actually be useful sometimes. For example, in the tutorial:
http://nehe.gamedev.net/data/lessons/lesson.asp?lesson=31
he loads a MilkShape3D model from a file. Such a file begins with a structure (he names it MS3DHeader) containing version info. A Python file object supports the method read([size]), which reads up to a specified number of bytes from the file. Now, if I declare a Python class MS3DHeader, how can I make sure it's going to be the exact same size as a struct MS3DHeader? Otherwise I can't read from the file properly. And there are more complex ones too.


// File header
struct MS3DHeader
{
	char m_ID[10];
	int m_version;
} PACK_STRUCT;

// Vertex information
struct MS3DVertex
{
	byte m_flags;
	float m_vertex[3];
	char m_boneID;
	byte m_refCount;
} PACK_STRUCT;

// Triangle information
struct MS3DTriangle
{
	word m_flags;
	word m_vertexIndices[3];
	float m_vertexNormals[3][3];
	float m_s[3], m_t[3];
	byte m_smoothingGroup;
	byte m_groupIndex;
} PACK_STRUCT;

// Material information
struct MS3DMaterial
{
    char m_name[32];
    float m_ambient[4];
    float m_diffuse[4];
    float m_specular[4];
    float m_emissive[4];
    float m_shininess;	// 0.0f - 128.0f
    float m_transparency;	// 0.0f - 1.0f
    byte m_mode;	// 0, 1, 2 is unused now
    char m_texture[128];
    char m_alphamap[128];
} PACK_STRUCT;

Matt Gerrans

Posts: 1153
Nickname: matt
Registered: Feb, 2002

Re: The size of a data type

Posted: Aug 31, 2004 8:13 AM

Have you seen the struct module? That lets you work with binary data for certain of the built-in data types that have known sizes. This would allow Python to read binary files written by C programs and vice versa.
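
As a rough illustration of the struct module applied to the MS3DHeader layout quoted earlier, here is a sketch assuming Python 3, a little-endian file, a 32-bit int, and no padding (as PACK_STRUCT suggests). The sample ID and version values are made up so the example is self-contained:

```python
import struct

# "<"   little-endian, standard sizes, no padding (like PACK_STRUCT)
# "10s" char m_ID[10]
# "i"   int m_version (assumed to be 32 bits)
HEADER_FORMAT = "<10si"

# The struct module's counterpart to sizeof() for a packed layout.
header_size = struct.calcsize(HEADER_FORMAT)

# Hypothetical header bytes; a real program would read them from the file.
raw = struct.pack(HEADER_FORMAT, b"MS3D000000", 4)

m_id, m_version = struct.unpack(HEADER_FORMAT, raw)
print(header_size, m_id, m_version)
```

With a real file, raw would come from f.read(header_size) on a file opened in binary mode.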

However, if you are writing and reading the file only with Python, you'd probably use serialization with the pickle module or something similar. Otherwise, if you are doing it C style, you have to come up with your own tricks to write lists, dictionaries, etc.
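
A minimal sketch of the pickle route, assuming Python 3; the state dictionary is a hypothetical stand-in for whatever the program needs to save:

```python
import pickle

# Hypothetical program state made of ordinary Python containers.
state = {"PC": 3, "IR": 7, "memory": [0] * 255}

blob = pickle.dumps(state)     # serialize to a bytes object
restored = pickle.loads(blob)  # rebuild an equal object from those bytes

print(type(blob).__name__, restored == state)
```

The byte layout of blob is pickle's own format, so this only helps when both the writer and the reader are Python programs.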

This doesn't answer your question about dynamically getting the size, though. I've seen this question asked on comp.lang.python a few times and never answered. I think it usually ends with a suggestion that the interested party would need to delve into the Python source. While a few Python types are simple (like int and float), most are not (like list, tuple, long, etc.), and it would require some work to figure out the size of the latter. Of course, even the size of the former would be platform-dependent. For this reason, I think it is philosophically more sensible for Python not to offer this size information. A lot of programmers coming straight from C would be comfortable using it even though it isn't really needed, and would create Python code that is unnecessarily brittle. It is better just to get used to no longer needing that kind of detail. (In the cases when it is necessary, because of working with existing binary formats, the struct module or C extensions will come to the rescue.)

Alexandru Ruse

Posts: 6
Nickname: jfrosty
Registered: Jul, 2003

Re: The size of a data type

Posted: Aug 31, 2004 5:12 PM

Thanks, didn't know about the struct module. I'll look into it. We definitely need a way to read binary files generated by C/C++/Delphi/etc, because most file formats of commercial applications are that way. If you are lucky enough to work with a documented file format, the code samples are still in C/C++. And it's just weird to write lots of Python code, even C extensions (!), to achieve a very common thing, which is done in C with a line of code:


read(filePtr, &myStruct, sizeof(myStruct));

assuming a function read(FILE*, void*, size_t) - replace with your favourite I/O library :)
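
For comparison, a rough Python counterpart to that one-line read can be built on the struct module. This sketch assumes Python 3, a little-endian packed MS3DVertex record with byte as unsigned char and a 32-bit float; the names VERTEX, MS3DVertex (as a namedtuple), and read_vertex are hypothetical:

```python
import io
import struct
from collections import namedtuple

# byte m_flags; float m_vertex[3]; char m_boneID; byte m_refCount (packed)
VERTEX = struct.Struct("<B3fbB")
MS3DVertex = namedtuple("MS3DVertex", "flags x y z bone_id ref_count")


def read_vertex(stream):
    """Read one packed vertex record from a binary stream."""
    data = stream.read(VERTEX.size)
    if len(data) != VERTEX.size:
        raise EOFError("truncated vertex record")
    return MS3DVertex._make(VERTEX.unpack(data))


# In-memory stand-in for a file opened with open(path, "rb").
stream = io.BytesIO(VERTEX.pack(0, 1.0, 2.0, 3.0, -1, 2))
vertex = read_vertex(stream)
print(VERTEX.size, vertex)
```

Once the format string is written down, the read itself is a single call, and VERTEX.size plays the role of sizeof(myStruct).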

Greg Jorgensen

Posts: 65
Nickname: gregjor
Registered: Feb, 2004

Re: The size of a data type

Posted: Sep 1, 2004 1:11 AM

> However, I don't agree with the following:
>
> > Python doesn't have statically-typed variables like C,
> > so there's no analog to C's sizeof().
>
> There is no reason we shouldn't be able to determine the
> size, in bytes, of an object in Python. In C, sizeof() is
> a compile-time expression (you can even use it in an array
> dimension declaration, so it's obviously constant). In a
> dynamic language like Python, it should be run-time, but I
> see no reason why it couldn't be done.

I didn't say there was no need for it; I said there is no analog to it. But I don't think there's any need for sizeof() in Python either. In C, sizeof is used (a) to make code portable across hardware platforms, and (b) for pointer arithmetic. Neither of those purposes has an analog in Python code.

The example you gave, reading binary data from a file into a C struct, seems to be a third use, but it's actually just an example of pointer arithmetic.

Obviously someone could implement a runtime sizeof() function that returned the number of bytes occupied by an object at that time, but I don't think it would be useful in Python. You can't use that information in the way C uses it. In the context of C sizeof() is a necessary construct; in the context of Python it's at best useless and at worst misleading.
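
Such a function was in fact added later: sys.getsizeof() has been available since Python 2.6. A short sketch, assuming CPython 3 (the exact numbers vary by version and platform, so none are shown):

```python
import sys

small = sys.getsizeof(0)  # bytes used by a small int object

text = "x" * 10000
container = [text]

text_size = sys.getsizeof(text)
# Counts only the list object itself, not the string it refers to.
container_size = sys.getsizeof(container)

print(small, text_size, container_size)
```

Because the result covers only the object itself and not what it references, it says little about file layouts, which is consistent with the point that it cannot be used the way C uses sizeof().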

All that said, you can find some answers in the O'Reilly book "Python In A Nutshell." Page 34:

"A plain integer takes up a few bytes of memory and has minimum and maximum values that are dictated by machine architecture. sys.maxint is the largest plain integer, while -sys.maxint-1 is the largest negative one. On typical 32-bit machines, sys.maxint is 2147483647."

The Python long integer type occupies at least as much memory as is necessary to store the value. Python automatically promotes plain integers to long integers, but does not convert longs back to plain integers. A long integer n would need at least floor(log2(n))+1 bits, most likely padded out to the next 32-bit boundary. But what use is this information? How would you use it?
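
The bit count above can be checked directly in later Python versions, where int.bit_length() exists (2.7 and 3.1 onward) and, in Python 3, plain and long integers are a single int type. A small sketch assuming CPython 3:

```python
import sys

n = 2 ** 100

# For positive n, bit_length() equals floor(log2(n)) + 1.
bits_needed = n.bit_length()
byte_bits = (255).bit_length()

# The object grows with the magnitude of the value it holds.
small_size = sys.getsizeof(1)
large_size = sys.getsizeof(n)

print(bits_needed, byte_bits, small_size, large_size)
```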

According to the same book, "A Python floating-point value corresponds to a C double and shares its limits of range and precision. ... Python currently offers no way to find out this range and precision."
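
That limitation was lifted later: sys.float_info (available since Python 2.6) reports the range and precision of the underlying C double. A sketch assuming Python 3 on a platform with IEEE 754 doubles, which is nearly universal but not guaranteed:

```python
import struct
import sys

info = sys.float_info

largest = info.max           # largest finite float
mantissa_bits = info.mant_dig  # bits of precision in the significand
decimal_digits = info.dig      # decimal digits that survive a round trip

# Size of a C double in bytes, according to the struct module.
double_bytes = struct.calcsize("d")

print(largest, mantissa_bits, decimal_digits, double_bytes)
```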

Strings would have to occupy at least one byte per character, but since Python strings can be Unicode rather than ASCII, even predicting the number of bytes occupied by a string of length N gets tricky.
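
To make the point about strings concrete, here is a small sketch assuming Python 3, where str is always Unicode. The number of bytes depends on the encoding chosen when the text is written out, not only on the number of characters:

```python
text = "h\u00e9llo"  # "héllo": five characters, one of them non-ASCII

char_count = len(text)
utf8_len = len(text.encode("utf-8"))       # the accented letter takes 2 bytes
utf16_len = len(text.encode("utf-16-le"))  # 2 bytes per character here

print(char_count, utf8_len, utf16_len)
```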

C and Python actually have few data types in common: one size of integer and one size of floating-point.

As another poster pointed out, there are Python libraries that can deal with mapping structures to binary data and files. And if you need to write binary data Python can easily do that, too. But Python works at a higher level of abstraction than C, at a level where mapping structures to files byte-by-byte, worrying about alignment and endian-ness, etc. are details intentionally hidden from the Python programmer. If you need to operate at that level -- just above the raw machine language -- C is your best choice.
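
For the cases where byte order and alignment do matter, the struct module mentioned earlier in the thread makes both explicit through the first character of the format string. A brief sketch, assuming Python 3:

```python
import struct

# Same 16-bit value, two byte orders.
little = struct.pack("<H", 1)  # least significant byte first
big = struct.pack(">H", 1)     # most significant byte first

# "<" uses standard sizes with no padding between fields.
packed_size = struct.calcsize("<BI")
# "@" uses the platform's native sizes and alignment, so padding may appear.
native_size = struct.calcsize("@BI")

print(little, big, packed_size, native_size)
```

The native size is platform-dependent, which is why formats for files shared with other programs usually state the byte order explicitly.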

Greg Jorgensen
PDXperts LLC - Portland, Oregon, USA
