From lethal at linux-sh.org Sat Dec 1 00:29:50 2007
From: lethal at linux-sh.org (Paul Mundt)
Date: Sat, 1 Dec 2007 17:29:50 +0900
Subject: [PATCH]: ARM SHMLBA
In-Reply-To: <475112CD.4020305@gmail.com>
References: <474234E8.60803@mvista.com> <47428CFC.2000504@st.com> <20071120081225.GA16864@linux-sh.org> <475112CD.4020305@gmail.com>
Message-ID: <20071201082950.GA16640@linux-sh.org>

Hi Carmelo,

On Sat, Dec 01, 2007 at 08:52:45AM +0100, Carmelo Amoroso wrote:
> Paul Mundt wrote:
> >On Tue, Nov 20, 2007 at 08:30:04AM +0100, Carmelo AMOROSO wrote:
> >>Same issue as SH4 (solved in the past).
> >
> >"Solved" is relative. What's there now works, but it's a pretty idiotic
> >hack, and is certainly not solved for the multiple page size cases. The
> >current implementation is certainly sub-optimal for 8kB (SH-X2 and later)
> >and 64kB PAGE_SIZE (SH-4, SH-5, SH-X2 and later, etc.), and we really do
> >need to take the L1 shape into account for handling this properly.
> >
> >So it does work, but it will spread things out far more than they need to
> >be. Most of the information needed for fixing this properly can be
> >extracted from the ELF auxvt, though I never quite got around to
> >finishing up the code for that.
> >
> do you mean passing the shm_alignment value from kernel
> to ld.so through the auxvt ? if so, I could try to provide
> a patch for the kernel and then update the ld.so to take care of this.
>

What I meant was using the same math for calculating shm_align_mask in
userspace based on the L1 D-cache shape. The auxvt already has cache shape
entries that the kernel can populate with the cache info, and that userspace
can use in a semi-portable fashion. If I recall correctly, it was Alpha that
added these initially.

So basically I would like to see AT_Lx_CACHESHAPE used for working this
out. The biggest issue is that we need a bit of leg-work in uClibc for
parsing the auxvt in the first place.
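To make the AT_Lx_CACHESHAPE idea concrete, here is a minimal sketch of the userspace math, under stated assumptions. The bit layout (bits 0-3 associativity, bits 4-7 log2 of the line size, remaining bits the cache size) follows the comment in glibc's <elf.h> as far as I understand it. The helper names and the "max(way size, page size) - 1" rule for the mask are illustrative assumptions, not existing uClibc or kernel code.

```c
#include <stddef.h>

/* Decoded form of an AT_L1D_CACHESHAPE auxv value (hypothetical helper type). */
struct cache_shape {
    unsigned long size;       /* total cache size in bytes */
    unsigned long line_size;  /* cache line size in bytes */
    unsigned long ways;       /* associativity */
};

/* Assumed layout: bits 0-3 = associativity, bits 4-7 = log2(line size),
 * everything above = cache size. */
static struct cache_shape decode_cache_shape(unsigned long shape)
{
    struct cache_shape c;

    c.ways = shape & 0xfUL;
    c.line_size = 1UL << ((shape >> 4) & 0xfUL);
    c.size = shape & ~0xffUL;
    return c;
}

/* Assumption: aliasing is bounded by the size of one cache way, so the
 * alignment mask is max(way size, page size) - 1. */
static unsigned long shm_align_mask_from_shape(unsigned long shape,
                                               unsigned long page_size)
{
    struct cache_shape c = decode_cache_shape(shape);
    unsigned long way_size = c.ways ? c.size / c.ways : c.size;
    unsigned long span = way_size > page_size ? way_size : page_size;

    return span - 1;
}
```

With a 16kB direct-mapped D-cache and 4kB pages this yields a 16kB alignment, while with 64kB pages the page size dominates and nothing is spread out further than needed.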
I had some code hacked together for that, but never got around to
polishing it off. We need this as a base step for hooking up the vdso
entry as well -- which I suspect might be of interest to you guys
especially since you can do away with the context switch overhead on your
sys_cacheflush ;-)

From A.Elch at gmx.at Sat Dec 1 03:36:56 2007
From: A.Elch at gmx.at (Andreas Erler)
Date: Sat, 1 Dec 2007 12:36:56 +0100
Subject: Crosscompiler on cygwin
References: <000801c82f5f$7d408c80$7764a8c0@ELCH> <475112EF.20608@gmail.com>
Message-ID: <000f01c8340e$7abca8f0$7764a8c0@ELCH>

Hello Carmelo,

thanks for your answer. Yes, I've tried different things, including
disabling mudflap, but nothing really worked. The good news is that I
switched to another PC, installed cygwin from scratch and built gcc4.2.2
for the cygwin system, and after that I was able to build the cross
toolchain for armeb.

Thanks a lot,
Andy

----- Original Message -----
From: "Carmelo Amoroso"
To: "Andreas Erler"
Cc:
Sent: Saturday, December 01, 2007 8:53 AM
Subject: Re: Crosscompiler on cygwin

> Andreas Erler wrote:
>> Hello,
>> I'm trying to build a cross compiler from i686-pc-cygwin to
>> armeb-linux-uclibc with buildroot.
>> When the build system configures libmudflap in the directory
>> gcc-4.2.1-final/armeb-linux-uclibc/
>> it spits out the following lines:
>>
>> checking for dlsym in -ldl...
>> >> error: none of the known symbol names works >> In the config.log I found the following lines for this error: >> configure:8533: >> /usr/src/buildroot/toolchain_build_armeb/gcc-4.2.1-final/./gcc/xgcc -B/usr/src/buildroot/toolchain_build_armeb/gcc-4.2.1-final/./gcc/ >> -B/usr/armeb-linux-uclibc/bin/ -B/usr/armeb-linux-uclibc/lib/ -isystem >> /usr/armeb-linux-uclibc/include -isystem >> /usr/armeb-linux-uclibc/sys-include -o conftest -g -Os >> conftest.c -lpthread >&5 >> collect2: ld terminated with signal 11 [Segmentation fault], core dumped >> /usr/src/buildroot/build_armeb/staging_dir/usr/armeb-linux-uclibc/bin/ld: >> /ecos-c/DOKUME~1/ELCH/LOKALE~1/Temp/cceuZPSX.o: invalid string offset >> 16777216 >= 40 for section `.strtab' >> configure:8539: $? = 1 >> Does anyone have an idea what this means? Is the linker broken? >> Thanks in advance for any help >> Andy >> >> > Have you tried disabling libmudflap? > > Carmelo >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> uClibc mailing list >> uClibc at uclibc.org >> http://busybox.net/cgi-bin/mailman/listinfo/uclibc > From bernds_cb1 at t-online.de Sat Dec 1 05:43:52 2007 From: bernds_cb1 at t-online.de (Bernd Schmidt) Date: Sat, 01 Dec 2007 14:43:52 +0100 Subject: FD-PIC patches for uClibc In-Reply-To: <475112F5.4040305@gmail.com> References: <474981B6.9070504@t-online.de> <029a01c82f72$2dd489e0$5267a8c0@Jocke> <4749F02C.3080706@t-online.de> <1196079962.12625.53.camel@gentoo-jocke.transmode.se> <474C0849.2050406@t-online.de> <2ccd6e3c0711270927u565b87ect618a6b2ae8d4000e@mail.gmail.com> <474EA553.1090100@t-online.de> <475112F5.4040305@gmail.com> Message-ID: <47516518.1010101@t-online.de> Carmelo Amoroso wrote: > with the solution proposed into the nptl branch, you can keep the caller > of _dl_find_hash always the same. 
Where the extra tpnt parameter is not
> required,
> independently from TLS or FDPIC code, like into ldso.c to lookup
> some function (malloc), you can safely pass NULL for it.
> In all other cases, you need simply to pass the extra tpnt, and the
> _dl_find_hash
> wrapper will pass it to the real function _dl_lookup_hash accordingly.
> This has been discussed in the past with Jocke to allow a simpler merge
> among
> trunk and branch.

To be honest, I don't see how this is simpler in any way than keeping
_dl_find_hash the way it is now, and only changing the function calls
which need the new functionality to call _dl_lookup_hash. Seems like
there would be less merge effort?

> Hopefully I could commit them next week... not access to svn right now.

I have commit access. If we can get consensus I can apply them.

Bernd
--
This footer brought to you by insane German lawmakers.
Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

From vda.linux at googlemail.com Sun Dec 2 02:44:43 2007
From: vda.linux at googlemail.com (Denys Vlasenko)
Date: Sun, 2 Dec 2007 02:44:43 -0800
Subject: [PATCH] realpath stack usage 12k -> 8k
Message-ID: <200712020244.43454.vda.linux@googlemail.com>

Hi,

Currently busybox's biggest stack user (per-function)
is actually in uclibc, not busybox. It's realpath().

The proposed patch uses the user-supplied buffer directly,
without an intermediate on-stack copy.
This can only make a difference if the user supplied
a buffer which is too small - thus the user breaks the API.

Failure scenario:

realpath("/link_name", user_buffer)

/link_name -> /very_long_name_which_fits_into_PATH_MAX_and_is_also_a_link
-> -> /shorter_name

If the user gives e.g. a 40-char user_buffer, the current implementation
will work; the patched one will overflow user_buffer with the intermediate name.
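For illustration, a caller that honours the PATH_MAX contract discussed here could look like the sketch below. The helper name resolve_path is hypothetical and exists only for this example; realpath() and PATH_MAX are the standard interfaces. With a buffer of this size, the intermediate link names in the failure scenario fit regardless of which realpath() implementation is in use.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask for the realpath() declaration */
#endif
#include <limits.h>
#include <stdlib.h>

/* Resolve `path` into a caller-owned buffer of at least PATH_MAX bytes.
 * Returns 0 on success, -1 on failure (errno is set by realpath()). */
static int resolve_path(const char *path, char out[PATH_MAX])
{
    return realpath(path, out) != NULL ? 0 : -1;
}
```

A caller would declare `char buf[PATH_MAX];` and pass it in, rather than sizing the buffer by the expected length of the final result.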

This should not be a problem - the user must supply a PATH_MAX sized buffer,
and in this case the patched version also works correctly.

Run tested. ACK, anyone?
--
vda
-------------- next part --------------
A non-text attachment was scrubbed...
Name: realpath.diff
Type: text/x-diff
Size: 2209 bytes
Desc: not available
Url : http://busybox.net/lists/uclibc/attachments/20071202/58543ce0/attachment.bin

From vda.linux at googlemail.com Sun Dec 2 13:48:23 2007
From: vda.linux at googlemail.com (Denys Vlasenko)
Date: Sun, 2 Dec 2007 13:48:23 -0800
Subject: [PATCH] realpath stack usage 8k -> 4k
In-Reply-To: <200712020244.43454.vda.linux@googlemail.com>
References: <200712020244.43454.vda.linux@googlemail.com>
Message-ID: <200712021348.23252.vda.linux@googlemail.com>

On Sunday 02 December 2007 02:44, Denys Vlasenko wrote:
> Proposed patch uses user-supplied buffer directly,
> without intermediate on-stack copy.
> This can only make a difference if user supplied
> a buffer which is too small - thus user breaks API.
>
> Failure scenario:
>
> realpath("/link_name", user_buffer)
>
> /link_name -> /very_long_name_which_fits_into_PATH_MAX_and_is_also_a_link
> -> -> /shorter_name
>
> If user will give e.g. 40-char user_buffer, current implementation
> will work, patched one will overflow user_buffer by intermediate name.
>
> This should not be a problem - user must supply PATH_MAX sized buffer,
> and in this case patched version also works correctly.

And the following patch, on top of the previous one, reuses copy_buf[]
for readlink, eliminating link_buf[].

In order to make it work, the "source" pathname is kept at the end of
copy_buf, not at the beginning (so that its last NUL byte
is the last byte of copy_buf[]).

The situation where readlink returns a link name which is too long
(so that it overwrites the pathname) previously resulted in an
ENAMETOOLONG error return. This patch does the same - the fact that
we now trash the pathname does not matter, as we are not returning it
to the user.

Run tested.
Can somebody review these patches please?
--
vda
-------------- next part --------------
A non-text attachment was scrubbed...
Name: realpath2.diff
Type: text/x-diff
Size: 2455 bytes
Desc: not available
Url : http://busybox.net/lists/uclibc/attachments/20071202/c432096d/attachment.bin

From carmelo73 at gmail.com Mon Dec 3 13:06:27 2007
From: carmelo73 at gmail.com (Carmelo Amoroso)
Date: Mon, 03 Dec 2007 22:06:27 +0100
Subject: [PATCH] Always inline system calls
In-Reply-To: <1196237053.10698.14.camel@localhost.localdomain>
References: <1193673669.8359.28.camel@localhost.localdomain> <1195224317.8050.23.camel@localhost.localdomain> <473DBAC9.2070002@st.com> <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain>
Message-ID: <47546FD3.3060003@gmail.com>

Hans-Christian Egtvedt wrote:
> On Tue, 2007-11-27 at 08:54 +0100, Joakim Tjernlund wrote:
>> On Tue, 2007-11-27 at 08:31 +0100, Hans-Christian Egtvedt wrote:
>>> Haavard Skinnemoen wrote:
>>>> On Fri, 16 Nov 2007 16:44:09 +0100
>>>> Carmelo AMOROSO wrote:
>>>>
>>>>> Just an idea... why not redefining 'inline' into
>>>>> ldso/ldso/avr32/dl-syscalls.h that is included
>>>>> at the top of ldso/include/dl-syscall.h.
>>>>> This will not affect any other architectures, letting
>>>>> the compiler to do the best choice ?
>>>> Are you sure no other architectures need this? gcc tends to get
>>>> extremely reluctant about inlining when compiling with -Os, and I have
>>>> a hard time believing that avr32 is the only architecture that can't
>>>> call functions before the GOT has been initialized.
>>>>
>>>> Actually, we probably can call functions before the GOT has been set up
>>>> if ldso is compiled with enough optimization, but I think it's more
>>>> robust to just make sure that the functions that are called early are
>>>> always inlined.
>>>> >>> This thread died, I can always make a patch for doing the inline stuff >>> only AVR32 specific. At least if others are seeing similar problems. For >>> example Buildroot will compile uClibc with -Os. >> I think always inline is better. As I recall there are some syscalls >> that must be inline or ldso will break for most archs >> > > Are there any architectures which break because of always inlining? > > Are there a huge difference in the binary size because of always inline? > Merged, cheers. Carmelo From carmelo73 at gmail.com Mon Dec 3 13:33:39 2007 From: carmelo73 at gmail.com (Carmelo Amoroso) Date: Mon, 03 Dec 2007 22:33:39 +0100 Subject: [PATCH] Always inline system calls In-Reply-To: <47546FD3.3060003@gmail.com> References: <1193673669.8359.28.camel@localhost.localdomain> <1195224317.8050.23.camel@localhost.localdomain> <473DBAC9.2070002@st.com> <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain> <47546FD3.3060003@gmail.com> Message-ID: <47547633.6010307@gmail.com> Carmelo Amoroso wrote: > Hans-Christian Egtvedt wrote: >> On Tue, 2007-11-27 at 08:54 +0100, Joakim Tjernlund wrote: >>> On Tue, 2007-11-27 at 08:31 +0100, Hans-Christian Egtvedt wrote: >>>> Haavard Skinnemoen wrote: >>>>> On Fri, 16 Nov 2007 16:44:09 +0100 >>>>> Carmelo AMOROSO wrote: >>>>> >>>>>> Just an idea... why not redefining 'inline' into >>>>>> ldso/ldso/avr32/dl-syscalls.h that is included >>>>>> at the top of ldso/include/dl-syscall.h. >>>>>> This will not affect any other architectures, letting >>>>>> the compiler to do the best choice ? >>>>> Are you sure no other architectures need this? gcc tends to get >>>>> extremely reluctant about inlining when compiling with -Os, and I have >>>>> a hard time believing that avr32 is the only architecture that can't >>>>> call functions before the GOT has been initialized. 
>>>>> >>>>> Actually, we probably can call functions before the GOT has been >>>>> set up >>>>> if ldso is compiled with enough optimization, but I think it's more >>>>> robust to just make sure that the functions that are called early are >>>>> always inlined. >>>>> >>>> This thread died, I can always make a patch for doing the inline >>>> stuff only AVR32 specific. At least if others are seeing similar >>>> problems. For example Buildroot will compile uClibc with -Os. >>> I think always inline is better. As I recall there are some syscalls >>> that must be inline or ldso will break for most archs >>> >> >> Are there any architectures which break because of always inlining? >> >> Are there a huge difference in the binary size because of always inline? >> > Merged, cheers. > Carmelo > Hello, while doing some test for SH4 to measure size increase for 'always inline' changes, doscovered suddenly that gcc-4.1.1 (cross sh4) fails with the following error: ../ldso/ldso/dl-elf.c: In function '_dl_dprintf': ../ldso/ldso/dl-elf.c:858: error: unable to find a register to spill in class 'R0_REGS' ../ldso/ldso/dl-elf.c:858: error: this is the insn: (insn 916 917 24 1 (set (reg/f:SI 1 r1 [219]) (mem/u/c:SI (plus:SI (reg:SI 12 r12) (reg/f:SI 1 r1 [220])) [0 S4 A32])) 172 {movsi_ie} (nil) (expr_list:REG_DEAD (reg/f:SI 1 r1 [220]) (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") ) (nil)))) either running with -Os or -O0. I'll test tomorrow with gcc-4.2.1 to see if it makes difference, otherwise I suspect we should go back on my proposal in using always inline only for arch strictly requiring it. 
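The fallback proposed here, forcing inlining only on architectures that strictly require it, might be sketched as follows. This is an illustration under assumptions: the TOY_* macro names and the toy_page_align helper are hypothetical and do not exist in uClibc; only GCC's `always_inline` function attribute is a real feature.

```c
/* Force inlining where the compiler supports the attribute. */
#if defined(__GNUC__)
# define TOY_ALWAYS_INLINE inline __attribute__((always_inline))
#else
# define TOY_ALWAYS_INLINE inline
#endif

/* An architecture that cannot call functions before its GOT is set up
 * would define TOY_ARCH_NEEDS_FORCED_INLINE in its own header; every
 * other architecture leaves the inlining decision to the compiler. */
#ifdef TOY_ARCH_NEEDS_FORCED_INLINE
# define TOY_EARLY_INLINE static TOY_ALWAYS_INLINE
#else
# define TOY_EARLY_INLINE static inline
#endif

/* Stand-in for a small helper that runs during early ldso startup. */
TOY_EARLY_INLINE unsigned long toy_page_align(unsigned long addr,
                                              unsigned long pagesize)
{
    return addr & ~(pagesize - 1);
}
```

The trade-off is the one raised in the thread: a single global policy is simpler to reason about, while a per-architecture switch avoids exposing compiler problems on targets that never needed forced inlining.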
Carmelo

From kraj at mvista.com Mon Dec 3 13:50:56 2007
From: kraj at mvista.com (Khem Raj)
Date: Mon, 03 Dec 2007 13:50:56 -0800
Subject: [PATCH] Always inline system calls
In-Reply-To: <47547633.6010307@gmail.com>
References: <1193673669.8359.28.camel@localhost.localdomain> <1195224317.8050.23.camel@localhost.localdomain> <473DBAC9.2070002@st.com> <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain> <47546FD3.3060003@gmail.com> <47547633.6010307@gmail.com>
Message-ID: <47547A40.7000909@mvista.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carmelo Amoroso wrote:
> while doing some test for SH4 to measure size increase for 'always inline' changes,
> discovered suddenly that gcc-4.1.1 (cross sh4) fails with the following error:
>
> ../ldso/ldso/dl-elf.c: In function '_dl_dprintf':
> ../ldso/ldso/dl-elf.c:858: error: unable to find a register to spill in class 'R0_REGS'
> ../ldso/ldso/dl-elf.c:858: error: this is the insn:
> (insn 916 917 24 1 (set (reg/f:SI 1 r1 [219])
>         (mem/u/c:SI (plus:SI (reg:SI 12 r12)
>                 (reg/f:SI 1 r1 [220])) [0 S4 A32])) 172 {movsi_ie} (nil)
>     (expr_list:REG_DEAD (reg/f:SI 1 r1 [220])
>         (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") )
>             (nil))))
>
> either running with -Os or -O0.

It would be nice if you could reduce the testcase and report this problem
in the gcc bugzilla.

> I'll test tomorrow with gcc-4.2.1 to see if it makes difference,
> otherwise I suspect we should go back on my proposal in using always inline
> only for arch strictly requiring it.
>
> Carmelo
> _______________________________________________
> uClibc mailing list
> uClibc at uclibc.org
> http://busybox.net/cgi-bin/mailman/listinfo/uclibc

- --
Khem Raj
MontaVista Software Inc.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHVHpAUjbQJxVzeZQRAu/fAKCYlUCFUF8askPxhf9qhx4w6OYOVwCdE6Wq
i2blmWrXZaRbuGVYIpAQB2w=
=Ldww
-----END PGP SIGNATURE-----

From lethal at linux-sh.org Mon Dec 3 14:07:47 2007
From: lethal at linux-sh.org (Paul Mundt)
Date: Tue, 4 Dec 2007 07:07:47 +0900
Subject: [PATCH] Always inline system calls
In-Reply-To: <47547633.6010307@gmail.com>
References: <1193673669.8359.28.camel@localhost.localdomain> <1195224317.8050.23.camel@localhost.localdomain> <473DBAC9.2070002@st.com> <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain> <47546FD3.3060003@gmail.com> <47547633.6010307@gmail.com>
Message-ID: <20071203220747.GA14446@linux-sh.org>

On Mon, Dec 03, 2007 at 10:33:39PM +0100, Carmelo Amoroso wrote:
> Carmelo Amoroso wrote:
> while doing some test for SH4 to measure size increase for 'always inline' changes,
> discovered suddenly that gcc-4.1.1 (cross sh4) fails with the following error:
>
> ../ldso/ldso/dl-elf.c: In function '_dl_dprintf':
> ../ldso/ldso/dl-elf.c:858: error: unable to find a register to spill in class 'R0_REGS'
> ../ldso/ldso/dl-elf.c:858: error: this is the insn:
> (insn 916 917 24 1 (set (reg/f:SI 1 r1 [219])
>         (mem/u/c:SI (plus:SI (reg:SI 12 r12)
>                 (reg/f:SI 1 r1 [220])) [0 S4 A32])) 172 {movsi_ie} (nil)
>     (expr_list:REG_DEAD (reg/f:SI 1 r1 [220])
>         (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") )
>             (nil))))
>
> either running with -Os or -O0.
> I'll test tomorrow with gcc-4.2.1 to see if it makes difference,
> otherwise I suspect we should go back on my proposal in using always inline
> only for arch strictly requiring it.
>

We've noticed this with some versions in buildroot also, so it seems to
be a more common issue:

  CC ldso/ldso/ldso.oS
In file included from ./libpthread/linuxthreads.old/sysdeps/sh/tls.h:23,
                 from ./include/bits/uClibc_errno.h:35,
                 from ./include/errno.h:62,
                 from ./include/bits/syscalls.h:16,
                 from ./include/sys/syscall.h:34,
                 from ./ldso/ldso/sh/dl-syscalls.h:3,
                 from ./ldso/include/dl-syscall.h:12,
                 from ./ldso/include/ldso.h:36,
                 from ldso/ldso/ldso.c:33:
./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: C99 inline functions are not supported; using GNU89
./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: to disable this warning use -fgnu89-inline or the gnu_inline function attribute
ldso/ldso/dl-elf.c: In function '_dl_dprintf':
ldso/ldso/dl-elf.c:803: error: unable to find a register to spill in class 'R0_REGS'
ldso/ldso/dl-elf.c:803: error: this is the insn:
(insn 884 885 23 3 (set (reg/f:SI 1 r1 [221])
        (mem/u/c:SI (plus:SI (reg:SI 12 r12)
                (reg/f:SI 1 r1 [222])) [0 S4 A32])) 171 {movsi_ie} (nil)
    (expr_list:REG_DEAD (reg/f:SI 1 r1 [222])
        (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") )
            (nil))))
ldso/ldso/dl-elf.c:803: confused by earlier errors, bailing out
make[1]: *** [ldso/ldso/ldso.oS] Error 1

So inlining itself may not really be a problem, but it might be worthwhile
hunting down the code that generates the immediate load and seeing if that can
be forced into a memory access instead, so we avoid the R0 encoding dependence.
These are all in relation to _dl_pagesize at least.

From bernds_cb1 at t-online.de Mon Dec 3 18:17:07 2007
From: bernds_cb1 at t-online.de (Bernd Schmidt)
Date: Tue, 04 Dec 2007 03:17:07 +0100
Subject: FD-PIC patches for uClibc
In-Reply-To: <475112F5.4040305@gmail.com>
References: <474981B6.9070504@t-online.de> <029a01c82f72$2dd489e0$5267a8c0@Jocke> <4749F02C.3080706@t-online.de> <1196079962.12625.53.camel@gentoo-jocke.transmode.se> <474C0849.2050406@t-online.de> <2ccd6e3c0711270927u565b87ect618a6b2ae8d4000e@mail.gmail.com> <474EA553.1090100@t-online.de> <475112F5.4040305@gmail.com>
Message-ID: <4754B8A3.8050004@t-online.de>

No one made further comments regarding the second set of patches, so
I've now made another pass over them to fix a few problems and style
issues, and checked them in.

A few minor issues remain to be resolved to get the Blackfin support
100% working, and the FRV needs attention as well. I'll be working on
getting Blackfin support complete in the near future.

Bernd
--
This footer brought to you by insane German lawmakers.
Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A.
Martin, Margaret Seif

From carmelo.amoroso at st.com Mon Dec 3 23:36:31 2007
From: carmelo.amoroso at st.com (Carmelo AMOROSO)
Date: Tue, 04 Dec 2007 08:36:31 +0100
Subject: [PATCH] Always inline system calls
In-Reply-To: <20071203220747.GA14446@linux-sh.org>
References: <1193673669.8359.28.camel@localhost.localdomain> <1195224317.8050.23.camel@localhost.localdomain> <473DBAC9.2070002@st.com> <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain> <47546FD3.3060003@gmail.com> <47547633.6010307@gmail.com> <20071203220747.GA14446@linux-sh.org>
Message-ID: <4755037F.5070806@st.com>

Paul Mundt wrote:
> On Mon, Dec 03, 2007 at 10:33:39PM +0100, Carmelo Amoroso wrote:
>
>> Carmelo Amoroso wrote:
>> while doing some test for SH4 to measure size increase for 'always inline' changes,
>> discovered suddenly that gcc-4.1.1 (cross sh4) fails with the following error:
>>
>> ../ldso/ldso/dl-elf.c: In function '_dl_dprintf':
>> ../ldso/ldso/dl-elf.c:858: error: unable to find a register to spill in class 'R0_REGS'
>> ../ldso/ldso/dl-elf.c:858: error: this is the insn:
>> (insn 916 917 24 1 (set (reg/f:SI 1 r1 [219])
>>         (mem/u/c:SI (plus:SI (reg:SI 12 r12)
>>                 (reg/f:SI 1 r1 [220])) [0 S4 A32])) 172 {movsi_ie} (nil)
>>     (expr_list:REG_DEAD (reg/f:SI 1 r1 [220])
>>         (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") )
>>             (nil))))
>>
>> either running with -Os or -O0.
>> I'll test tomorrow with gcc-4.2.1 to see if it makes difference,
>> otherwise I suspect we should go back on my proposal in using always inline
>> only for arch strictly requiring it.
>>
>>
> We've noticed this with some versions in buildroot also, so it seems to
> be a more common issue:
>
>
Indeed, at home I have the gcc from buildroot. I tested just now at the office
using gcc-4.2.1 from the STMicro toolchain
and it works fine, with either -O0 or -Os.
I'll try to update buildroot at home with gcc 4.2.1 as well and see if it solves. I'll keep you informed. Cheers, Carmelo > CC ldso/ldso/ldso.oS > In file included from ./libpthread/linuxthreads.old/sysdeps/sh/tls.h:23, > from ./include/bits/uClibc_errno.h:35, > from ./include/errno.h:62, > from ./include/bits/syscalls.h:16, > from ./include/sys/syscall.h:34, > from ./ldso/ldso/sh/dl-syscalls.h:3, > from ./ldso/include/dl-syscall.h:12, > from ./ldso/include/ldso.h:36, > from ldso/ldso/ldso.c:33: > ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: C99 inline functions are not supported; using GNU89 > ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: to disable this warning use -fgnu89-inline or the gnu_inline function attribute > ldso/ldso/dl-elf.c: In function '_dl_dprintf': > ldso/ldso/dl-elf.c:803: error: unable to find a register to spill in class 'R0_REGS' > ldso/ldso/dl-elf.c:803: error: this is the insn: > (insn 884 885 23 3 (set (reg/f:SI 1 r1 [221]) > (mem/u/c:SI (plus:SI (reg:SI 12 r12) > (reg/f:SI 1 r1 [222])) [0 S4 A32])) 171 {movsi_ie} (nil) > (expr_list:REG_DEAD (reg/f:SI 1 r1 [222]) > (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") ) > (nil)))) > ldso/ldso/dl-elf.c:803: confused by earlier errors, bailing out > make[1]: *** [ldso/ldso/ldso.oS] Error 1 > > So inlining itself may not really be a problem, but it might be worthwhile > hunting down the code that generates the immediate load and seeing if that can > be forced in to a memory access instead, so we avoid the R0 encoding dependence. > These are all in relation to _dl_pagesize at least. 
> _______________________________________________ > uClibc mailing list > uClibc at uclibc.org > http://busybox.net/cgi-bin/mailman/listinfo/uclibc > > From carmelo.amoroso at st.com Tue Dec 4 00:39:00 2007 From: carmelo.amoroso at st.com (Carmelo AMOROSO) Date: Tue, 04 Dec 2007 09:39:00 +0100 Subject: svn commit: trunk/uClibc/ldso: include ldso ldso/bfin libdl In-Reply-To: <20071203225419.E6E8C12802B@busybox.net> References: <20071203225419.E6E8C12802B@busybox.net> Message-ID: <47551224.2040400@st.com> bernds at uclibc.org wrote: > Author: bernds > Date: 2007-12-03 14:54:16 -0800 (Mon, 03 Dec 2007) > New Revision: 20614 > > Log: > Blackfin FD-PIC patch 3/6. > Change _dl_find_hash to _dl_lookup_hash, as on the NPTL branch. > _dl_find_hash is now a wrapper function around it; unlike on the NPTL branch, > it retains the old interface so that not all callers need to be changed. > > _dl_lookup_hash can optionally give its caller a pointer to the module where > the symbol was found. > > Introduce ELF_RTYPE_CLASS_DLSYM for lookups from libdl. > > Spelling fixes in the Blackfin port, since Alex Oliva's original version of > these patches used _dl_find_hash_mod as the name of the function rather than > _dl_lookup_hash. 
>
>
> Modified:
>    trunk/uClibc/ldso/include/dl-defs.h
>    trunk/uClibc/ldso/include/dl-elf.h
>    trunk/uClibc/ldso/include/dl-hash.h
>    trunk/uClibc/ldso/ldso/bfin/elfinterp.c
>    trunk/uClibc/ldso/ldso/dl-hash.c
>    trunk/uClibc/ldso/libdl/libdl.c
>
>
> [SNIP]
>
> Modified: trunk/uClibc/ldso/include/dl-hash.h
> ===================================================================
> --- trunk/uClibc/ldso/include/dl-hash.h 2007-12-03 22:46:53 UTC (rev 20613)
> +++ trunk/uClibc/ldso/include/dl-hash.h 2007-12-03 22:54:16 UTC (rev 20614)
> @@ -105,9 +105,23 @@
>      DL_LOADADDR_TYPE loadaddr, unsigned long * dynamic_info,
>      unsigned long dynamic_addr, unsigned long dynamic_size);
>
> -extern char * _dl_find_hash(const char * name, struct dyn_elf * rpnt1,
> -    struct elf_resolve *mytpnt, int type_class);
> +extern char * _dl_lookup_hash(const char * name, struct dyn_elf * rpnt,
> +    struct elf_resolve *mytpnt, int type_class
> +#ifdef __FDPIC__
> +    , struct elf_resolve **tpntp
> +#endif
> +    );
>
> +static __always_inline char *_dl_find_hash(const char *name, struct dyn_elf *rpnt,
> +    struct elf_resolve *mytpnt, int type_class)
> +{
> +#ifdef __FDPIC__
> +    return _dl_lookup_hash(name, rpnt, mytpnt, type_class, NULL);
> +#else
> +    return _dl_lookup_hash(name, rpnt, mytpnt, type_class);
> +#endif
> +}
> +
>  extern int _dl_linux_dynamic_link(void);
>
>  extern char * _dl_library_path;
>
I think that when nptl merge will be completed, we could use something
like that:

#if defined USE_TLS || defined __FDPIC__
#define HASH_EXTRA_TPNT
#else
#undef HASH_EXTRA_TPNT
#endif

and use it in _dl_find_hash wrapper.
I've understood that you are keeping _dl_find_hash just the same to not
break all other arch, right?
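A self-contained sketch of how a single HASH_EXTRA_TPNT switch could drive both the lookup function and its wrapper is shown below. Only the HASH_EXTRA_TPNT macro name comes from the suggestion above; the toy_* types and functions are hypothetical stand-ins for struct elf_resolve, _dl_lookup_hash and _dl_find_hash, not the real ldso code.

```c
#include <stddef.h>
#include <string.h>

/* One switch instead of repeating the TLS/FDPIC test at every site. */
#if defined USE_TLS || defined __FDPIC__
# define HASH_EXTRA_TPNT
#else
# undef HASH_EXTRA_TPNT
#endif

/* Toy stand-in for struct elf_resolve: one symbol per "module". */
struct toy_module {
    const char *symname;
    char *addr;
};

static char toy_storage[1];
static struct toy_module toy_modules[] = { { "malloc", toy_storage } };

/* Stand-in for _dl_lookup_hash: the extra out-parameter exists only
 * when HASH_EXTRA_TPNT is defined. */
static char *toy_lookup_hash(const char *name
#ifdef HASH_EXTRA_TPNT
                             , struct toy_module **found
#endif
                             )
{
    size_t i;

    for (i = 0; i < sizeof toy_modules / sizeof toy_modules[0]; i++) {
        if (strcmp(toy_modules[i].symname, name) == 0) {
#ifdef HASH_EXTRA_TPNT
            if (found)
                *found = &toy_modules[i];
#endif
            return toy_modules[i].addr;
        }
    }
#ifdef HASH_EXTRA_TPNT
    if (found)
        *found = NULL;
#endif
    return NULL;
}

/* Stand-in for the _dl_find_hash wrapper: callers that do not care
 * about the defining module keep the old, shorter interface. */
static inline char *toy_find_hash(const char *name)
{
#ifdef HASH_EXTRA_TPNT
    return toy_lookup_hash(name, NULL);
#else
    return toy_lookup_hash(name);
#endif
}
```

Callers of toy_find_hash compile unchanged in either configuration, which is the property both sides of this discussion want to preserve; the open question in the thread is only whether callers needing the module pointer should go through the wrapper or call the lookup function directly.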
> Modified: trunk/uClibc/ldso/ldso/bfin/elfinterp.c > =================================================================== > --- trunk/uClibc/ldso/ldso/bfin/elfinterp.c 2007-12-03 22:46:53 UTC (rev 20613) > +++ trunk/uClibc/ldso/ldso/bfin/elfinterp.c 2007-12-03 22:54:16 UTC (rev 20614) > @@ -72,11 +72,9 @@ > got_entry = (struct funcdesc_value *) DL_RELOC_ADDR(tpnt->loadaddr, this_reloc->r_offset); > > /* Get the address to be used to fill in the GOT entry. */ > - new_addr = _dl_find_hash_mod(symname, tpnt->symbol_scope, NULL, 0, > - &new_tpnt); > + new_addr = _dl_lookup_hash(symname, tpnt->symbol_scope, NULL, 0, &new_tpnt); > if (!new_addr) { > - new_addr = _dl_find_hash_mod(symname, NULL, NULL, 0, > - &new_tpnt); > + new_addr = _dl_lookup_hash(symname, NULL, NULL, 0, &new_tpnt); > if (!new_addr) { > _dl_dprintf(2, "%s: can't resolve symbol '%s'\n", > _dl_progname, symname); > @@ -188,7 +186,7 @@ > } else { > > symbol_addr = (unsigned long) > - _dl_find_hash_mod(symname, scope, NULL, 0, &symbol_tpnt); > + _dl_lookup_hash(symname, scope, NULL, 0, &symbol_tpnt); > > /* > * We want to allow undefined references to weak symbols - this might > I expect to see, after nptl merge, all elfinterp.c calling always _dl_find_hash with the extra parameter passed: it will be NULL, if not used (not fdpic or not tls), not NULL otherwise. I think that mixing _dl_lookup_hash and _dl_find_hash invocation could create confusion. > Modified: trunk/uClibc/ldso/ldso/dl-hash.c > =================================================================== > --- trunk/uClibc/ldso/ldso/dl-hash.c 2007-12-03 22:46:53 UTC (rev 20613) > +++ trunk/uClibc/ldso/ldso/dl-hash.c 2007-12-03 22:54:16 UTC (rev 20614) > @@ -257,7 +257,12 @@ > * This function resolves externals, and this is either called when we process > * relocations or when we call an entry in the PLT table for the first time. 
> */ > -char *_dl_find_hash(const char *name, struct dyn_elf *rpnt, struct elf_resolve *mytpnt, int type_class) > +char *_dl_lookup_hash(const char *name, struct dyn_elf *rpnt, > + struct elf_resolve *mytpnt, int type_class > +#ifdef __FDPIC__ > + , struct elf_resolve **tpntp > +#endif > + ) > { > comment as above on HASH_EXTRA_TPNT > struct elf_resolve *tpnt = NULL; > ElfW(Sym) *symtab; > @@ -265,7 +270,8 @@ > unsigned long elf_hash_number = 0xffffffff; > const ElfW(Sym) *sym = NULL; > > - char *weak_result = NULL; > + const ElfW(Sym) *weak_sym = 0; > + struct elf_resolve *weak_tpnt = 0; > > #ifdef __LDSO_GNU_HASH_SUPPORT__ > unsigned long gnu_hash_number = _dl_gnu_hash((const unsigned char *)name); > @@ -326,15 +332,32 @@ > #if 0 > /* Perhaps we should support old style weak symbol handling > * per what glibc does when you export LD_DYNAMIC_WEAK */ > - if (!weak_result) > - weak_result = (char *) DL_RELOC_ADDR(tpnt->loadaddr, sym->st_value); > + if (!weak_sym) { > + weak_tpnt = tpnt; > + weak_sym = sym; > + } > break; > #endif > case STB_GLOBAL: > - return (char*) DL_RELOC_ADDR(tpnt->loadaddr, sym->st_value); > +#ifdef __FDPIC__ > + if (tpntp) > + *tpntp = tpnt; > +#endif > comment as above on HASH_EXTRA_TPNT > + return DL_FIND_HASH_VALUE (tpnt, type_class, sym); > default: /* Local symbols not handled here */ > break; > } > } > - return weak_result; > + if (weak_sym) { > +#ifdef __FDPIC__ > + if (tpntp) > + *tpntp = weak_tpnt; > +#endif > + return DL_FIND_HASH_VALUE (weak_tpnt, type_class, weak_sym); > + } > +#ifdef __FDPIC__ > + if (tpntp) > + *tpntp = NULL; > +#endif > + return NULL; > } > > same > Modified: trunk/uClibc/ldso/libdl/libdl.c > =================================================================== > --- trunk/uClibc/ldso/libdl/libdl.c 2007-12-03 22:46:53 UTC (rev 20613) > +++ trunk/uClibc/ldso/libdl/libdl.c 2007-12-03 22:54:16 UTC (rev 20614) > @@ -500,7 +500,7 @@ > tpnt = NULL; > if (handle == _dl_symbol_tables) > tpnt = handle->dyn; /* Only search 
RTLD_GLOBAL objs if global object */ > - ret = _dl_find_hash(name2, handle, tpnt, 0); > + ret = _dl_find_hash(name2, handle, tpnt, ELF_RTYPE_CLASS_DLSYM); > > /* > * Nothing found. > > I've not seen how ELF_RTYPE_CLASS_DLSYM is used... have I missed something? Cheers, Carmelo > _______________________________________________ > uClibc-cvs mailing list > uClibc-cvs at uclibc.org > http://busybox.net/cgi-bin/mailman/listinfo/uclibc-cvs > > From rep.dot.nop at gmail.com Tue Dec 4 04:04:31 2007 From: rep.dot.nop at gmail.com (Bernhard Fischer) Date: Tue, 4 Dec 2007 13:04:31 +0100 Subject: [PATCH] Always inline system calls In-Reply-To: <4755037F.5070806@st.com> References: <1195224317.8050.23.camel@localhost.localdomain> <473DBAC9.2070002@st.com> <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain> <47546FD3.3060003@gmail.com> <47547633.6010307@gmail.com> <20071203220747.GA14446@linux-sh.org> <4755037F.5070806@st.com> Message-ID: <20071204120431.GB29108@aon.at> On Tue, Dec 04, 2007 at 08:36:31AM +0100, Carmelo AMOROSO wrote: >Paul Mundt wrote: >> On Mon, Dec 03, 2007 at 10:33:39PM +0100, Carmelo Amoroso wrote: >> >>> Carmelo Amoroso wrote: >>> while doing some test for SH4 to measure size increase for 'always inline' changes, >>> doscovered suddenly that gcc-4.1.1 (cross sh4) fails with the following error: >>> >>> ../ldso/ldso/dl-elf.c: In function '_dl_dprintf': >>> ../ldso/ldso/dl-elf.c:858: error: unable to find a register to spill in class 'R0_REGS' >>> ../ldso/ldso/dl-elf.c:858: error: this is the insn: >>> (insn 916 917 24 1 (set (reg/f:SI 1 r1 [219]) >>> (mem/u/c:SI (plus:SI (reg:SI 12 r12) >>> (reg/f:SI 1 r1 [220])) [0 S4 A32])) 172 {movsi_ie} (nil) >>> (expr_list:REG_DEAD (reg/f:SI 1 r1 [220]) >>> (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") ) >>> (nil)))) >>> >>> either running with -Os or -O0. 
>>> I'll test tomorrow with gcc-4.2.1 to see if it makes difference, >>> otherwise I suspect we should go back on my proposal in using always inline >>> only for arch strictly requiring it. >>> >>> >> We've noticed this with some versions in buildroot also, so it seems to >> be a more common issue: >> >> >Indeed at home I have gcc from buildroot. Tested just now at office >using gcc-4.2.1 from STMicro toolchain >and it works fine, either using -O0 or -Os. >I'll try to update buildroot at home with gcc 4.2.1 as well and see if >it solves. >I'll keep you informed. > >Cheers, >Carmelo >> CC ldso/ldso/ldso.oS >> In file included from ./libpthread/linuxthreads.old/sysdeps/sh/tls.h:23, >> from ./include/bits/uClibc_errno.h:35, >> from ./include/errno.h:62, >> from ./include/bits/syscalls.h:16, >> from ./include/sys/syscall.h:34, >> from ./ldso/ldso/sh/dl-syscalls.h:3, >> from ./ldso/include/dl-syscall.h:12, >> from ./ldso/include/ldso.h:36, >> from ldso/ldso/ldso.c:33: >> ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: C99 inline functions are not supported; using GNU89 >> ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: to disable this warning use -fgnu89-inline or the gnu_inline function attribute This needs fixing anyway. See the patch in my buildroot repo. 
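A minimal sketch of what the hint in the second warning refers to, assuming a GCC (or compatible compiler) that accepts the gnu_inline attribute. The helper name add_one is hypothetical; it is not taken from pt-machine.h or from the buildroot patch mentioned above. It only illustrates that gnu_inline keeps the GNU89 meaning of "extern inline" whatever -std= is in effect:

```c
/* Hypothetical helper, not part of uClibc.
 *
 * With gnu_inline, "extern inline" keeps its GNU89 meaning even when the
 * translation unit is compiled as C99 or later: this body is used for
 * inlining only and is never emitted as a standalone function. */
extern inline __attribute__((gnu_inline)) int add_one(int x)
{
	return x + 1;
}

/* Ordinary out-of-line definition.  It is what gets called whenever the
 * compiler decides not to inline (for example at -O0).  GNU89 inline
 * semantics allow it to follow the extern inline definition above. */
int add_one(int x)
{
	return x + 1;
}
```

Because the inline body is never emitted as a symbol of its own, a regular definition such as the second one has to exist somewhere for calls that are not inlined. Calling add_one(41) gives 42 whichever definition the compiler picks.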
From carmelo.amoroso at st.com Tue Dec 4 04:31:13 2007 From: carmelo.amoroso at st.com (Carmelo AMOROSO) Date: Tue, 04 Dec 2007 13:31:13 +0100 Subject: [PATCH] Always inline system calls In-Reply-To: <20071204120431.GB29108@aon.at> References: <1195224317.8050.23.camel@localhost.localdomain> <473DBAC9.2070002@st.com> <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain> <47546FD3.3060003@gmail.com> <47547633.6010307@gmail.com> <20071203220747.GA14446@linux-sh.org> <4755037F.5070806@st.com> <20071204120431.GB29108@aon.at> Message-ID: <47554891.1090903@st.com> Bernhard Fischer wrote: > On Tue, Dec 04, 2007 at 08:36:31AM +0100, Carmelo AMOROSO wrote: > >> Paul Mundt wrote: >> >>> On Mon, Dec 03, 2007 at 10:33:39PM +0100, Carmelo Amoroso wrote: >>> >>> >>>> Carmelo Amoroso wrote: >>>> while doing some test for SH4 to measure size increase for 'always inline' changes, >>>> doscovered suddenly that gcc-4.1.1 (cross sh4) fails with the following error: >>>> >>>> ../ldso/ldso/dl-elf.c: In function '_dl_dprintf': >>>> ../ldso/ldso/dl-elf.c:858: error: unable to find a register to spill in class 'R0_REGS' >>>> ../ldso/ldso/dl-elf.c:858: error: this is the insn: >>>> (insn 916 917 24 1 (set (reg/f:SI 1 r1 [219]) >>>> (mem/u/c:SI (plus:SI (reg:SI 12 r12) >>>> (reg/f:SI 1 r1 [220])) [0 S4 A32])) 172 {movsi_ie} (nil) >>>> (expr_list:REG_DEAD (reg/f:SI 1 r1 [220]) >>>> (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") ) >>>> (nil)))) >>>> >>>> either running with -Os or -O0. >>>> I'll test tomorrow with gcc-4.2.1 to see if it makes difference, >>>> otherwise I suspect we should go back on my proposal in using always inline >>>> only for arch strictly requiring it. >>>> >>>> >>>> >>> We've noticed this with some versions in buildroot also, so it seems to >>> be a more common issue: >>> >>> >>> >> Indeed at home I have gcc from buildroot. 
Tested just now at office >> using gcc-4.2.1 from STMicro toolchain >> and it works fine, either using -O0 or -Os. >> I'll try to update buildroot at home with gcc 4.2.1 as well and see if >> it solves. >> I'll keep you informed. >> >> Cheers, >> Carmelo >> >>> CC ldso/ldso/ldso.oS >>> In file included from ./libpthread/linuxthreads.old/sysdeps/sh/tls.h:23, >>> from ./include/bits/uClibc_errno.h:35, >>> from ./include/errno.h:62, >>> from ./include/bits/syscalls.h:16, >>> from ./include/sys/syscall.h:34, >>> from ./ldso/ldso/sh/dl-syscalls.h:3, >>> from ./ldso/include/dl-syscall.h:12, >>> from ./ldso/include/ldso.h:36, >>> from ldso/ldso/ldso.c:33: >>> ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: C99 inline functions are not supported; using GNU89 >>> ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: to disable this warning use -fgnu89-inline or the gnu_inline function attribute >>> > > This needs fixing anyway. See the patch in my buildroot repo. > > Hi Bernd, could you provide the link... searching among gcc4.1.1 patches from buildroot svn I did find something related to this. 
Carmelo From rep.dot.nop at gmail.com Tue Dec 4 05:07:33 2007 From: rep.dot.nop at gmail.com (Bernhard Fischer) Date: Tue, 4 Dec 2007 14:07:33 +0100 Subject: [PATCH] Always inline system calls In-Reply-To: <47554891.1090903@st.com> References: <20071119103030.7b594215@dhcp-255-175.norway.atmel.com> <474BC7BE.2030802@atmel.com> <1196150050.6777.13.camel@gentoo-jocke.transmode.se> <1196237053.10698.14.camel@localhost.localdomain> <47546FD3.3060003@gmail.com> <47547633.6010307@gmail.com> <20071203220747.GA14446@linux-sh.org> <4755037F.5070806@st.com> <20071204120431.GB29108@aon.at> <47554891.1090903@st.com> Message-ID: <20071204130733.GA750@aon.at> On Tue, Dec 04, 2007 at 01:31:13PM +0100, Carmelo AMOROSO wrote: > Bernhard Fischer wrote: >> On Tue, Dec 04, 2007 at 08:36:31AM +0100, Carmelo AMOROSO wrote: >> >>> Paul Mundt wrote: >>> >>>> On Mon, Dec 03, 2007 at 10:33:39PM +0100, Carmelo Amoroso wrote: >>>> >>>>> Carmelo Amoroso wrote: >>>>> while doing some test for SH4 to measure size increase for 'always inline' changes, >>>>> doscovered suddenly that gcc-4.1.1 (cross sh4) fails with the following error: >>>>> >>>>> ../ldso/ldso/dl-elf.c: In function '_dl_dprintf': >>>>> ../ldso/ldso/dl-elf.c:858: error: unable to find a register to spill in class 'R0_REGS' >>>>> ../ldso/ldso/dl-elf.c:858: error: this is the insn: >>>>> (insn 916 917 24 1 (set (reg/f:SI 1 r1 [219]) >>>>> (mem/u/c:SI (plus:SI (reg:SI 12 r12) >>>>> (reg/f:SI 1 r1 [220])) [0 S4 A32])) 172 {movsi_ie} (nil) >>>>> (expr_list:REG_DEAD (reg/f:SI 1 r1 [220]) >>>>> (expr_list:REG_EQUIV (symbol_ref:SI ("_dl_pagesize") ) >>>>> (nil)))) >>>>> >>>>> either running with -Os or -O0. >>>>> I'll test tomorrow with gcc-4.2.1 to see if it makes difference, >>>>> otherwise I suspect we should go back on my proposal in using always inline >>>>> only for arch strictly requiring it. 
>>>>> >>>>> >>>> We've noticed this with some versions in buildroot also, so it seems to >>>> be a more common issue: >>>> >>>> >>> Indeed at home I have gcc from buildroot. Tested just now at office using >>> gcc-4.2.1 from STMicro toolchain >>> and it works fine, either using -O0 or -Os. >>> I'll try to update buildroot at home with gcc 4.2.1 as well and see if it >>> solves. >>> I'll keep you informed. >>> >>> Cheers, >>> Carmelo >>> >>>> CC ldso/ldso/ldso.oS >>>> In file included from ./libpthread/linuxthreads.old/sysdeps/sh/tls.h:23, >>>> from ./include/bits/uClibc_errno.h:35, >>>> from ./include/errno.h:62, >>>> from ./include/bits/syscalls.h:16, >>>> from ./include/sys/syscall.h:34, >>>> from ./ldso/ldso/sh/dl-syscalls.h:3, >>>> from ./ldso/include/dl-syscall.h:12, >>>> from ./ldso/include/ldso.h:36, >>>> from ldso/ldso/ldso.c:33: >>>> ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: C99 inline functions are not supported; using GNU89 >>>> ./libpthread/linuxthreads.old/sysdeps/sh/pt-machine.h:36: warning: to disable this warning use -fgnu89-inline or the gnu_inline function attribute >>>> >> >> This needs fixing anyway. See the patch in my buildroot repo. >> >> > Hi Bernd, > could you provide the link... searching among gcc4.1.1 patches from > buildroot svn > I did find something related to this. http://repo.or.cz/w/buildroot.git?a=tree;f=toolchain/uClibc;hb=HEAD From sjhill at realitydiluted.com Wed Dec 5 06:45:08 2007 From: sjhill at realitydiluted.com (Steven J. Hill) Date: Wed, 5 Dec 2007 08:45:08 -0600 Subject: NPTL merge status... Message-ID: <20071205144508.GA6301@real.realitydiluted.com> I just wanted people to know that NPTL merging is not standing still. I have Carmelo's latest changes and am integrating and testing a lot with MIPS to make sure nothing breaks for that architecture. I am spending time on this every evening right now, so I am really trying to get the merge wrapped up. Thanks for your patience. 
-Steve From solar at gentoo.org Wed Dec 5 12:51:33 2007 From: solar at gentoo.org (Ned Ludd) Date: Wed, 05 Dec 2007 12:51:33 -0800 Subject: NPTL merge status... In-Reply-To: <20071205144508.GA6301@real.realitydiluted.com> References: <20071205144508.GA6301@real.realitydiluted.com> Message-ID: <1196887893.13361.46.camel@hangover> On Wed, 2007-12-05 at 08:45 -0600, Steven J. Hill wrote: > I just wanted people to know that NPTL merging is not standing still. I > have Carmelo's latest changes and am integrating and testing a lot with > MIPS to make sure nothing breaks for that architecture. I am spending > time on this every evening right now, so I am really trying to get the > merge wrapped up. Thanks for your patience. thanks for the update. > > -Steve > _______________________________________________ > uClibc mailing list > uClibc at uclibc.org > http://busybox.net/cgi-bin/mailman/listinfo/uclibc > -- Ned Ludd Gentoo Linux From bernds_cb1 at t-online.de Thu Dec 6 05:46:39 2007 From: bernds_cb1 at t-online.de (Bernd Schmidt) Date: Thu, 06 Dec 2007 14:46:39 +0100 Subject: svn commit: trunk/uClibc/ldso: include ldso ldso/bfin libdl In-Reply-To: <47551224.2040400@st.com> References: <20071203225419.E6E8C12802B@busybox.net> <47551224.2040400@st.com> Message-ID: <4757FD3F.9000006@t-online.de> Carmelo AMOROSO wrote: > I think that when nptl merge will be completed, we could use something > like that: > > #if defined USE_TLS || defined __FDPIC__ > #define HASH_EXTRA_TPNT > #else > #undef HASH_EXTRA_TPNT > #endif > > and use it in _dl_find_hash wrapper. I'd much rather get rid of the ifdeffery and just always add the extra parameter to _dl_lookup_hash. > I've understood that you are keeping _dl_find_hash just the same to not > break all other arch, right? Yes. > I expect to see, after nptl merge, all elfinterp.c calling always > _dl_find_hash with the extra parameter passed: > it will be NULL, if not used (not fdpic or not tls), not NULL otherwise. 
> I think that mixing _dl_lookup_hash and _dl_find_hash invocation could > create confusion. I don't see how, but if we defined _dl_lookup_hash so as to always have the extra arg we can just get rid of _dl_find_hash, which would have the effect you want. Bernd -- This footer brought to you by insane German lawmakers. Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif From bernds_cb1 at t-online.de Thu Dec 6 06:41:09 2007 From: bernds_cb1 at t-online.de (Bernd Schmidt) Date: Thu, 06 Dec 2007 15:41:09 +0100 Subject: svn commit: trunk/uClibc/ldso: include ldso ldso/bfin libdl In-Reply-To: <47551224.2040400@st.com> References: <20071203225419.E6E8C12802B@busybox.net> <47551224.2040400@st.com> Message-ID: <47580A05.6050800@t-online.de> Carmelo AMOROSO wrote: >> Modified: trunk/uClibc/ldso/libdl/libdl.c >> =================================================================== >> --- trunk/uClibc/ldso/libdl/libdl.c 2007-12-03 22:46:53 UTC (rev 20613) >> +++ trunk/uClibc/ldso/libdl/libdl.c 2007-12-03 22:54:16 UTC (rev 20614) >> @@ -500,7 +500,7 @@ >> tpnt = NULL; >> if (handle == _dl_symbol_tables) >> tpnt = handle->dyn; /* Only search RTLD_GLOBAL objs if global object */ >> - ret = _dl_find_hash(name2, handle, tpnt, 0); >> + ret = _dl_find_hash(name2, handle, tpnt, ELF_RTYPE_CLASS_DLSYM); >> >> /* >> * Nothing found. >> >> > I've not seen how ELF_RTYPE_CLASS_DLSYM is used... have I missed something? > It's hidden away in the FRV/Blackfin specific directories: /* We want to return to dlsym() a function descriptor if the symbol turns out to be a function. */ #define DL_FIND_HASH_VALUE(TPNT, TYPE_CLASS, SYM) \ (((TYPE_CLASS) & ELF_RTYPE_CLASS_DLSYM) \ && ELF32_ST_TYPE((SYM)->st_info) == STT_FUNC \ ? 
_dl_funcdesc_for (DL_RELOC_ADDR ((TPNT)->loadaddr, (SYM)->st_value), \ (TPNT)->loadaddr.got_value) \ : DL_RELOC_ADDR ((TPNT)->loadaddr, (SYM)->st_value)) Bernd From chris at zankel.net Thu Dec 6 12:50:07 2007 From: chris at zankel.net (Chris Zankel) Date: Thu, 6 Dec 2007 12:50:07 -0800 Subject: Xtensa support for uClibc [8/9] Message-ID: <20071206205007.6AAD33086C@atlanta.zankel.net> Add support for Xtensa to uClibc [8/9]: Syscalls and startup file part 2. --- diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/pread_write.c uClibc-0.9.29/libc/sysdeps/linux/xtensa/pread_write.c --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/pread_write.c 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/pread_write.c 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,193 @@ +/* vi: set sw=4 ts=4: */ +/* + * Copyright (C) 2000-2006 Erik Andersen + * + * Licensed under the LGPL v2.1, see the file COPYING.LIB in this tarball. + */ +/* + * Based in part on the files + * ./sysdeps/unix/sysv/linux/pwrite.c, + * ./sysdeps/unix/sysv/linux/pread.c, + * sysdeps/posix/pread.c + * sysdeps/posix/pwrite.c + * from GNU libc 2.2.5, but reworked considerably... + */ + +#include +#include +#include +#include + +extern __typeof(pread) __libc_pread; +extern __typeof(pwrite) __libc_pwrite; +#ifdef __UCLIBC_HAS_LFS__ +extern __typeof(pread64) __libc_pread64; +extern __typeof(pwrite64) __libc_pwrite64; +#endif + +#include + +#ifdef __NR_pread + +# define __NR___syscall_pread __NR_pread +/* On Xtensa, 64-bit values are aligned in even/odd register pairs.
*/ +static inline _syscall6(ssize_t, __syscall_pread, int, fd, void *, buf, + size_t, count, int, pad, off_t, offset_hi, off_t, offset_lo); + +ssize_t __libc_pread(int fd, void *buf, size_t count, off_t offset) +{ + return __syscall_pread(fd, buf, count, 0, __LONG_LONG_PAIR(offset >> 31, offset)); +} +weak_alias(__libc_pread,pread) + +# ifdef __UCLIBC_HAS_LFS__ +ssize_t __libc_pread64(int fd, void *buf, size_t count, off64_t offset) +{ + uint32_t low = offset & 0xffffffff; + uint32_t high = offset >> 32; + return __syscall_pread(fd, buf, count, 0, __LONG_LONG_PAIR(high, low)); +} +weak_alias(__libc_pread64,pread64) +# endif /* __UCLIBC_HAS_LFS__ */ + +#endif /* __NR_pread */ + +#ifdef __NR_pwrite + +# define __NR___syscall_pwrite __NR_pwrite +/* On Xtensa, 64-bit values are aligned in even/odd register pairs. */ +static inline _syscall6(ssize_t, __syscall_pwrite, int, fd, const void *, buf, + size_t, count, int, pad, off_t, offset_hi, off_t, offset_lo); + +ssize_t __libc_pwrite(int fd, const void *buf, size_t count, off_t offset) +{ + return __syscall_pwrite(fd, buf, count, 0, __LONG_LONG_PAIR(offset >> 31, offset)); +} +weak_alias(__libc_pwrite,pwrite) + +# ifdef __UCLIBC_HAS_LFS__ +ssize_t __libc_pwrite64(int fd, const void *buf, size_t count, off64_t offset) +{ + uint32_t low = offset & 0xffffffff; + uint32_t high = offset >> 32; + return __syscall_pwrite(fd, buf, count, 0, __LONG_LONG_PAIR(high, low)); +} +weak_alias(__libc_pwrite64,pwrite64) +# endif /* __UCLIBC_HAS_LFS__ */ +#endif /* __NR_pwrite */ + +#if ! defined __NR_pread || ! defined __NR_pwrite +libc_hidden_proto(read) +libc_hidden_proto(write) +libc_hidden_proto(lseek) + +static ssize_t __fake_pread_write(int fd, void *buf, + size_t count, off_t offset, int do_pwrite) +{ + int save_errno; + ssize_t result; + off_t old_offset; + + /* Since we must not change the file pointer preserve the + * value so that we can restore it later. 
*/ + if ((old_offset=lseek(fd, 0, SEEK_CUR)) == (off_t) -1) + return -1; + + /* Set to wanted position. */ + if (lseek(fd, offset, SEEK_SET) == (off_t) -1) + return -1; + + if (do_pwrite == 1) { + /* Write the data. */ + result = write(fd, buf, count); + } else { + /* Read the data. */ + result = read(fd, buf, count); + } + + /* Now we have to restore the position. If this fails we + * have to return this as an error. */ + save_errno = errno; + if (lseek(fd, old_offset, SEEK_SET) == (off_t) -1) + { + if (result == -1) + __set_errno(save_errno); + return -1; + } + __set_errno(save_errno); + return(result); +} + +# ifdef __UCLIBC_HAS_LFS__ +libc_hidden_proto(lseek64) + +static ssize_t __fake_pread_write64(int fd, void *buf, + size_t count, off64_t offset, int do_pwrite) +{ + int save_errno; + ssize_t result; + off64_t old_offset; + + /* Since we must not change the file pointer preserve the + * value so that we can restore it later. */ + if ((old_offset=lseek64(fd, 0, SEEK_CUR)) == (off64_t) -1) + return -1; + + /* Set to wanted position. */ + if (lseek64(fd, offset, SEEK_SET) == (off64_t) -1) + return -1; + + if (do_pwrite == 1) { + /* Write the data. */ + result = write(fd, buf, count); + } else { + /* Read the data. */ + result = read(fd, buf, count); + } + + /* Now we have to restore the position. */ + save_errno = errno; + if (lseek64(fd, old_offset, SEEK_SET) == (off64_t) -1) { + if (result == -1) + __set_errno (save_errno); + return -1; + } + __set_errno (save_errno); + return result; +} +# endif /* __UCLIBC_HAS_LFS__ */ +#endif /* ! defined __NR_pread || ! 
defined __NR_pwrite */ + +#ifndef __NR_pread +ssize_t __libc_pread(int fd, void *buf, size_t count, off_t offset) +{ + return __fake_pread_write(fd, buf, count, offset, 0); +} +weak_alias(__libc_pread,pread) + +# ifdef __UCLIBC_HAS_LFS__ +ssize_t __libc_pread64(int fd, void *buf, size_t count, off64_t offset) +{ + return __fake_pread_write64(fd, buf, count, offset, 0); +} +weak_alias(__libc_pread64,pread64) +# endif /* __UCLIBC_HAS_LFS__ */ +#endif /* ! __NR_pread */ + +#ifndef __NR_pwrite +ssize_t __libc_pwrite(int fd, const void *buf, size_t count, off_t offset) +{ + /* we won't actually be modifying the buffer, + *just cast it to get rid of warnings */ + return __fake_pread_write(fd, (void*)buf, count, offset, 1); +} +weak_alias(__libc_pwrite,pwrite) + +# ifdef __UCLIBC_HAS_LFS__ +ssize_t __libc_pwrite64(int fd, const void *buf, size_t count, off64_t offset) +{ + return __fake_pread_write64(fd, (void*)buf, count, offset, 1); +} +weak_alias(__libc_pwrite64,pwrite64) +# endif /* __UCLIBC_HAS_LFS__ */ +#endif /* ! __NR_pwrite */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/setjmp.S uClibc-0.9.29/libc/sysdeps/linux/xtensa/setjmp.S --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/setjmp.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/setjmp.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,131 @@ +/* setjmp for Xtensa Processors. + Copyright (C) 2001, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +/* This implementation relies heavily on the Xtensa register window + mechanism. Setjmp flushes all the windows except its own to the + stack and then copies registers from the save areas on the stack + into the jmp_buf structure, along with the return address of the call + to setjmp. Longjmp invalidates all the windows except its own, and + then sets things up so that it will return to the right place, + using a window underflow to automatically restore the registers. + + Note that it would probably be sufficient to only copy the + registers from setjmp's caller into jmp_buf. However, we also copy + the save area located at the stack pointer of setjmp's caller. + This save area will typically remain intact until the longjmp call. + The one exception is when there is an intervening alloca in + setjmp's caller. This is certainly an unusual situation and is + likely to cause problems in any case (the storage allocated on the + stack cannot be safely accessed following the longjmp). As bad as + it is, on most systems this situation would not necessarily lead to + a catastrophic failure. If we did not preserve the extra save area + on Xtensa, however, it would. When setjmp's caller returns after a + longjmp, there will be a window underflow; an invalid return + address or stack pointer in the save area will almost certainly + lead to a crash. Keeping a copy of the extra save area in the + jmp_buf avoids this with only a small additional cost. If setjmp + and longjmp are ever time-critical, this could be removed. 
*/ + +#include "sysdep.h" + +/* int setjmp (a2 = jmp_buf env) */ + +ENTRY (_setjmp) + movi a3, 0 + j 1f +END (_setjmp) +libc_hidden_def (_setjmp) + +ENTRY (setjmp) + movi a3, 1 + j 1f +END (setjmp) + +/* int __sigsetjmp (a2 = jmp_buf env, + a3 = int savemask) */ + +ENTRY (__sigsetjmp) +1: + /* Flush registers. */ + movi a4, __window_spill + callx4 a4 + + /* Preserve the second argument (savemask) in a15. The selection + of a15 is arbitrary, except it's otherwise unused. There is no + risk of triggering a window overflow since we just returned + from __window_spill(). */ + mov a15, a3 + + /* Copy the register save area at (sp - 16). */ + addi a5, a1, -16 + l32i a3, a5, 0 + l32i a4, a5, 4 + s32i a3, a2, 0 + s32i a4, a2, 4 + l32i a3, a5, 8 + l32i a4, a5, 12 + s32i a3, a2, 8 + s32i a4, a2, 12 + + /* Copy 0-8 words from the register overflow area. */ + extui a3, a0, 30, 2 + blti a3, 2, .Lendsj + l32i a7, a1, 4 + slli a4, a3, 4 + sub a5, a7, a4 + addi a6, a2, 16 + addi a7, a7, -16 // a7 = end of register overflow area +.Lsjloop: + l32i a3, a5, 0 + l32i a4, a5, 4 + s32i a3, a6, 0 + s32i a4, a6, 4 + l32i a3, a5, 8 + l32i a4, a5, 12 + s32i a3, a6, 8 + s32i a4, a6, 12 + addi a5, a5, 16 + addi a6, a6, 16 + blt a5, a7, .Lsjloop +.Lendsj: + + /* Copy the register save area at sp. */ + l32i a3, a1, 0 + l32i a4, a1, 4 + s32i a3, a2, 48 + s32i a4, a2, 52 + l32i a3, a1, 8 + l32i a4, a1, 12 + s32i a3, a2, 56 + s32i a4, a2, 60 + + /* Save the return address, including the window size bits. */ + s32i a0, a2, 64 + + /* a2 still addresses jmp_buf. a15 contains savemask. 
*/ + mov a6, a2 + mov a7, a15 + movi a3, __sigjmp_save + callx4 a3 + mov a2, a6 + retw +END(__sigsetjmp) + +weak_extern(_setjmp) +weak_extern(setjmp) diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/syscall.S uClibc-0.9.29/libc/sysdeps/linux/xtensa/syscall.S --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/syscall.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/syscall.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,42 @@ +/* Copyright (C) 2005, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. 
*/ + +#include "sysdep.h" + +/* The register layout upon entering the function is: + + arguments syscall number arg0, arg1, arg2, arg3, arg4, arg5 + --------- -------------- ---------------------------------- + function a2 a3, a4, a5, a6, a7, (sp) + syscall a2 a6, a3, a4, a5, a8, a9 + */ + +ENTRY (syscall) + l32i a9, a1, 16 /* load extra argument from stack */ + mov a8, a7 + mov a7, a3 /* preserve a3 in a7 */ + mov a3, a4 + mov a4, a5 + mov a5, a6 + mov a6, a7 + syscall + movi a4, -4095 + bgeu a2, a4, SYSCALL_ERROR_LABEL +.Lpseudo_end: + retw +PSEUDO_END (syscall) diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/sysdep.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/sysdep.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/sysdep.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/sysdep.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,160 @@ +/* Assembler macros for Xtensa processors. + Copyright (C) 2001, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#ifdef __ASSEMBLER__ + +#define ALIGNARG(log2) 1 << log2 +#define ASM_TYPE_DIRECTIVE(name, typearg) .type name, typearg +#define ASM_SIZE_DIRECTIVE(name) .size name, . 
- name + +#ifdef __STDC__ +#define C_LABEL(name) name : +#else +#define C_LABEL(name) name/**/: +#endif + +#define ENTRY(name) \ + ASM_GLOBAL_DIRECTIVE C_SYMBOL_NAME(name); \ + ASM_TYPE_DIRECTIVE (C_SYMBOL_NAME(name), @function); \ + .align ALIGNARG(2); \ + LITERAL_POSITION; \ + C_LABEL(name) \ + entry sp, FRAMESIZE; \ + CALL_MCOUNT + +#undef END +#define END(name) ASM_SIZE_DIRECTIVE(name) + +/* Define a macro for this directive so it can be removed in a few places. */ +#define LITERAL_POSITION .literal_position + +#undef JUMPTARGET +#ifdef PIC +/* The "@PLT" suffix is currently a no-op for non-shared linking, but + it doesn't hurt to use it conditionally for PIC code in case that + changes someday. */ +#define JUMPTARGET(name) name##@PLT +#else +#define JUMPTARGET(name) name +#endif + +#define FRAMESIZE 16 +#define CALL_MCOUNT /* Do nothing. */ + + +/* Linux uses a negative return value to indicate syscall errors, + unlike most Unices, which use the condition codes' carry flag. + + Since version 2.1 the return value of a system call might be + negative even if the call succeeded. E.g., the `lseek' system call + might return a large offset. Therefore we must not anymore test + for < 0, but test for a real error by making sure the value in a2 + is a real error number. Linus said he will make sure the no syscall + returns a value in -1 .. -4095 as a valid result so we can safely + test with -4095. */ + +/* We don't want the label for the error handler to be global when we define + it here. 
*/ +#define SYSCALL_ERROR_LABEL 0f + +#undef PSEUDO +#define PSEUDO(name, syscall_name, args) \ + .text; \ + ENTRY (name) \ + DO_CALL (syscall_name, args); \ + movi a4, -4095; \ + bgeu a2, a4, SYSCALL_ERROR_LABEL; \ + .Lpseudo_end: + +#undef PSEUDO_END +#define PSEUDO_END(name) \ + SYSCALL_ERROR_HANDLER \ + END (name) + +#undef PSEUDO_NOERRNO +#define PSEUDO_NOERRNO(name, syscall_name, args) \ + .text; \ + ENTRY (name) \ + DO_CALL (syscall_name, args) + +#undef PSEUDO_END_NOERRNO +#define PSEUDO_END_NOERRNO(name) \ + END (name) + +#undef ret_NOERRNO +#define ret_NOERRNO retw + +/* The function has to return the error code. */ +#undef PSEUDO_ERRVAL +#define PSEUDO_ERRVAL(name, syscall_name, args) \ + .text; \ + ENTRY (name) \ + DO_CALL (syscall_name, args); \ + neg a2, a2 + +#undef PSEUDO_END_ERRVAL +#define PSEUDO_END_ERRVAL(name) \ + END (name) + +#define ret_ERRVAL retw + +#if RTLD_PRIVATE_ERRNO +# define SYSCALL_ERROR_HANDLER \ +0: movi a4, rtld_errno; \ + neg a2, a2; \ + s32i a2, a4, 0; \ + movi a2, -1; \ + j .Lpseudo_end; + +#elif defined _LIBC_REENTRANT + +# if USE___THREAD +# ifndef NOT_IN_libc +# define SYSCALL_ERROR_ERRNO __libc_errno +# else +# define SYSCALL_ERROR_ERRNO errno +# endif +# define SYSCALL_ERROR_HANDLER \ +0: rur a4, THREADPTR; \ + movi a3, SYSCALL_ERROR_ERRNO@TPOFF; \ + neg a2, a2; \ + add a4, a4, a3; \ + s32i a2, a4, 0; \ + movi a2, -1; \ + j .Lpseudo_end; +# else /* !USE___THREAD */ +# define SYSCALL_ERROR_HANDLER \ +0: neg a2, a2; \ + mov a6, a2; \ + movi a4, __errno_location@PLT; \ + callx4 a4; \ + s32i a2, a6, 0; \ + movi a2, -1; \ + j .Lpseudo_end; +# endif /* !USE___THREAD */ +#else /* !_LIBC_REENTRANT */ +#define SYSCALL_ERROR_HANDLER \ +0: movi a4, errno; \ + neg a2, a2; \ + s32i a2, a4, 0; \ + movi a2, -1; \ + j .Lpseudo_end; +#endif /* _LIBC_REENTRANT */ + +#endif /* __ASSEMBLER__ */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/vfork.S uClibc-0.9.29/libc/sysdeps/linux/xtensa/vfork.S ---
uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/vfork.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/vfork.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,170 @@ +/* Copyright (C) 2005, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#include "sysdep.h" +#include +#define _SIGNAL_H +#include + + +/* Clone the calling process, but without copying the whole address space. + The calling process is suspended until the new process exits or is + replaced by a call to `execve'. Return -1 for errors, 0 to the new process, + and the process ID of the new process to the old process. + + Note that it is important that we don't create a new stack frame for the + caller. */ + + +/* The following are defined in linux/sched.h, which unfortunately + is not safe for inclusion in an assembly file. 
*/ +#define CLONE_VM 0x00000100 /* set if VM shared between processes */ +#define CLONE_VFORK 0x00004000 /* set if the parent wants the child to + wake it up on mm_release */ + +#ifndef SAVE_PID +#define SAVE_PID +#endif + +#ifndef RESTORE_PID +#define RESTORE_PID +#endif + + +/* pid_t vfork(void); + Implemented as __clone_syscall(CLONE_VFORK | CLONE_VM | SIGCHLD, 0) */ + +ENTRY (__vfork) + + movi a6, .Ljumptable + extui a2, a0, 30, 2 // call-size: call4/8/12 = 1/2/3 + addx4 a4, a2, a6 // find return address in jumptable + l32i a4, a4, 0 + add a4, a4, a6 + + slli a2, a2, 30 + xor a3, a0, a2 // remove call-size from return address + extui a5, a4, 30, 2 // get high bits of jump target + slli a5, a5, 30 + or a3, a3, a5 // stuff them into the return address + xor a4, a4, a5 // clear high bits of jump target + or a0, a4, a2 // create temporary return address + retw // "return" to .L4, .L8, or .L12 + + .align 4 +.Ljumptable: + .word 0 + .word .L4 - .Ljumptable + .word .L8 - .Ljumptable + .word .L12 - .Ljumptable + + /* a7: return address */ +.L4: mov a12, a2 + mov a13, a3 + + SAVE_PID + + /* Use syscall 'clone'. Set new stack pointer to the same address. 
*/ + movi a2, SYS_ify (clone) + movi a3, 0 + movi a6, CLONE_VM | CLONE_VFORK | SIGCHLD + syscall + + RESTORE_PID + + movi a5, -4096 + + mov a6, a2 + mov a2, a12 + mov a3, a13 + + bgeu a6, a5, 1f + jx a7 +1: call4 .Lerr // returns to original caller + + + /* a11: return address */ +.L8: mov a12, a2 + mov a13, a3 + mov a14, a6 + + SAVE_PID + + movi a2, SYS_ify (clone) + movi a3, 0 + movi a6, CLONE_VM | CLONE_VFORK | SIGCHLD + syscall + + RESTORE_PID + + movi a9, -4096 + + mov a10, a2 + mov a2, a12 + mov a3, a13 + mov a6, a14 + + bgeu a10, a9, 1f + jx a11 +1: call8 .Lerr // returns to original caller + + + /* a15: return address */ +.L12: mov a12, a2 + mov a13, a3 + mov a14, a6 + + SAVE_PID + + movi a2, SYS_ify (clone) + movi a3, 0 + movi a6, CLONE_VM | CLONE_VFORK | SIGCHLD + syscall + + RESTORE_PID + + mov a3, a13 + movi a13, -4096 + + mov a6, a14 + mov a14, a2 + + mov a2, a12 + + bgeu a14, a13, 1f + jx a15 +1: call12 .Lerr // returns to original caller + + + .align 4 +.Lerr: entry a1, 16 + + /* Restore the return address. */ + extui a4, a0, 30, 2 // get the call-size bits + slli a4, a4, 30 + slli a3, a3, 2 // clear high bits of target address + srli a3, a3, 2 + or a0, a3, a4 // combine them + + PSEUDO_END (__vfork) +.Lpseudo_end: + retw + +libc_hidden_def (__vfork) + +weak_alias (__vfork, vfork) diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/windowspill.S uClibc-0.9.29/libc/sysdeps/linux/xtensa/windowspill.S --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/windowspill.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/windowspill.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,95 @@ +/* Function to force register windows to the stack. + Copyright (C) 2005, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#include + + .text + .align 4 + .literal_position + .global __window_spill + .type __window_spill, @function +__window_spill: + entry a1, 48 + bbci.l a0, 31, .L4 // branch if called with call4 + bbsi.l a0, 30, .L12 // branch if called with call12 + + /* Called with call8: touch register NUM_REGS-12 (4/20/52) */ +.L8: +#if XCHAL_NUM_AREGS > 16 + call12 1f + retw + + .align 4 +1: _entry a1, 48 // touch NUM_REGS-24 (x/8/40) + +#if XCHAL_NUM_AREGS == 32 + mov a8, a0 + retw +#else + _entry a1, 48 // touch NUM_REGS-36 (x/x/28) + mov a12, a0 + _entry a1, 48 // touch NUM_REGS-48 (x/x/16) + mov a12, a0 + _entry a1, 16 // touch NUM_REGS-60 (x/x/4) +#endif +#endif + mov a4, a0 + retw + + /* Called with call4: touch register NUM_REGS-8 (8/24/56) */ +.L4: +#if XCHAL_NUM_AREGS == 16 + mov a8, a0 +#else + call12 1f + retw + + .align 4 +1: _entry a1, 48 // touch NUM_REGS-20 (x/12/44) + mov a12, a0 +#if XCHAL_NUM_AREGS > 32 + _entry a1, 48 // touch NUM_REGS-32 (x/x/32) + mov a12, a0 + _entry a1, 48 // touch NUM_REGS-44 (x/x/20) + mov a12, a0 + _entry a1, 48 // touch NUM_REGS-56 (x/x/8) + mov a8, a0 +#endif +#endif + retw + + /* Called with call12: touch register NUM_REGS-16 (x/16/48) */ +.L12: +#if XCHAL_NUM_AREGS > 16 + call12 1f + retw + + 
.align 4 +1: _entry a1, 48 // touch NUM_REGS-28 (x/4/36) +#if XCHAL_NUM_AREGS == 32 + mov a4, a0 +#else + mov a12, a0 + _entry a1, 48 // touch NUM_REGS-40 (x/x/24) + mov a12, a0 + _entry a1, 48 // touch NUM_REGS-52 (x/x/12) + mov a12, a0 +#endif +#endif + retw From chris at zankel.net Thu Dec 6 12:50:00 2007 From: chris at zankel.net (Chris Zankel) Date: Thu, 6 Dec 2007 12:50:00 -0800 Subject: Xtensa support for uClibc [5/9] Message-ID: <20071206205000.12C1C30868@atlanta.zankel.net> Add support for Xtensa to uClibc [5/9]: Various bits header files part 2. --- diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/setjmp.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/setjmp.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/setjmp.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/setjmp.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,46 @@ +/* Copyright (C) 1997, 1998, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +/* Define the machine-dependent type `jmp_buf'. Xtensa version. */ +#ifndef _BITS_SETJMP_H +#define _BITS_SETJMP_H 1 + +#if !defined _SETJMP_H && !defined _PTHREAD_H +# error "Never include directly; use instead." 
+#endif + +/* The jmp_buf structure for Xtensa holds the following (where "proc" + is the procedure that calls setjmp): 4-12 registers from the window + of proc, the 4 words from the save area at proc's $sp (in case a + subsequent alloca in proc moves $sp), and the return address within + proc. Everything else is saved on the stack in the normal save areas. */ + +#ifndef _ASM +typedef int __jmp_buf[17]; +#endif + +#define JB_SP 1 +#define JB_PC 16 + +/* Test if longjmp to JMPBUF would unwind the frame containing a local + variable at ADDRESS. */ + +#define _JMPBUF_UNWINDS(jmpbuf, address) \ + ((void *) (address) < (void *) (jmpbuf)[JB_SP]) + +#endif /* bits/setjmp.h */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/shm.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/shm.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/shm.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/shm.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,115 @@ +/* Copyright (C) 1995, 1996, 1997, 2000, 2002, 2007 + Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#ifndef _SYS_SHM_H +# error "Never include directly; use instead." 
+#endif + +#include + +/* Permission flag for shmget. */ +#define SHM_R 0400 /* or S_IRUGO from */ +#define SHM_W 0200 /* or S_IWUGO from */ + +/* Flags for `shmat'. */ +#define SHM_RDONLY 010000 /* attach read-only else read-write */ +#define SHM_RND 020000 /* round attach address to SHMLBA */ +#define SHM_REMAP 040000 /* take-over region on attach */ + +/* Commands for `shmctl'. */ +#define SHM_LOCK 11 /* lock segment (root only) */ +#define SHM_UNLOCK 12 /* unlock segment (root only) */ + +__BEGIN_DECLS + +/* Segment low boundary address multiple. */ +#define SHMLBA (__getpagesize ()) +extern int __getpagesize (void) __THROW __attribute__ ((__const__)); + + +/* Type to count number of attaches. */ +typedef unsigned long int shmatt_t; + +/* Data structure describing a shared memory segment. */ +struct shmid_ds + { + struct ipc_perm shm_perm; /* operation permission struct */ + size_t shm_segsz; /* size of segment in bytes */ +#if defined (__XTENSA_EL__) + __time_t shm_atime; /* time of last shmat() */ + unsigned long int __unused1; + __time_t shm_dtime; /* time of last shmdt() */ + unsigned long int __unused2; + __time_t shm_ctime; /* time of last change by shmctl() */ + unsigned long int __unused3; +#elif defined (__XTENSA_EB__) + unsigned long int __unused1; + __time_t shm_atime; /* time of last shmat() */ + unsigned long int __unused2; + __time_t shm_dtime; /* time of last shmdt() */ + unsigned long int __unused3; + __time_t shm_ctime; /* time of last change by shmctl() */ +#else +# error endian order not defined +#endif + __pid_t shm_cpid; /* pid of creator */ + __pid_t shm_lpid; /* pid of last shmop */ + shmatt_t shm_nattch; /* number of current attaches */ + unsigned long int __unused4; + unsigned long int __unused5; + }; + +#ifdef __USE_MISC + +/* ipcs ctl commands */ +# define SHM_STAT 13 +# define SHM_INFO 14 + +/* shm_mode upper byte flags */ +# define SHM_DEST 01000 /* segment will be destroyed on last detach */ +# define SHM_LOCKED 02000 /* segment will 
not be swapped */ +# define SHM_HUGETLB 04000 /* segment is mapped via hugetlb */ +# define SHM_NORESERVE 010000 /* don't check for reservations */ + +struct shminfo + { + unsigned long int shmmax; + unsigned long int shmmin; + unsigned long int shmmni; + unsigned long int shmseg; + unsigned long int shmall; + unsigned long int __unused1; + unsigned long int __unused2; + unsigned long int __unused3; + unsigned long int __unused4; + }; + +struct shm_info + { + int used_ids; + unsigned long int shm_tot; /* total allocated shm */ + unsigned long int shm_rss; /* total resident shm */ + unsigned long int shm_swp; /* total swapped shm */ + unsigned long int swap_attempts; + unsigned long int swap_successes; + }; + +#endif /* __USE_MISC */ + +__END_DECLS diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/sigcontextinfo.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/sigcontextinfo.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/sigcontextinfo.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/sigcontextinfo.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,33 @@ +/* Copyright (C) 2003, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. 
*/ + +/* Also see register-dump.h, where we spill live registers to the + stack so that we can trace the stack backward. */ + +#define SIGCONTEXT unsigned long _info, ucontext_t * +#define SIGCONTEXT_EXTRA_ARGS _info, + +/* ANDing with 0x3fffffff clears the window-size bits. + Assumes TASK_SIZE = 0x40000000. */ + +#define GET_PC(ctx) ((void *) (ctx->uc_mcontext.sc_pc & 0x3fffffff)) +#define GET_FRAME(ctx) ((void *) ctx->uc_mcontext.sc_a[1]) +#define GET_STACK(ctx) ((void *) ctx->uc_mcontext.sc_a[1]) +#define CALL_SIGHANDLER(handler, signo, ctx) \ + (handler)((signo), SIGCONTEXT_EXTRA_ARGS (ctx)) + diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/stackinfo.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/stackinfo.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/stackinfo.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/stackinfo.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,28 @@ +/* Copyright (C) 2000, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +/* This file contains a bit of information about the stack allocation + of the processor. */ + +#ifndef _STACKINFO_H +#define _STACKINFO_H 1 + +/* On Xtensa the stack grows down. 
*/ +#define _STACK_GROWS_DOWN 1 + +#endif /* stackinfo.h */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/stat.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/stat.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/stat.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/stat.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,153 @@ +/* Copyright (C) 1992, 1995-2005, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#ifndef _SYS_STAT_H +# error "Never include directly; use instead." +#endif + +/* Versions of the `struct stat' data structure. */ +#define _STAT_VER_KERNEL 0 +#define _STAT_VER_LINUX 1 +#define _STAT_VER _STAT_VER_LINUX + +/* Versions of the `xmknod' interface. */ +#define _MKNOD_VER_LINUX 0 + + +struct stat + { + __dev_t st_dev; /* Device. */ +#ifndef __USE_FILE_OFFSET64 + __ino_t st_ino; /* File serial number. */ +#else + __ino64_t st_ino; /* File serial number. */ +#endif + __mode_t st_mode; /* File mode. */ + __nlink_t st_nlink; /* Link count. */ + __uid_t st_uid; /* User ID of the file's owner. */ + __gid_t st_gid; /* Group ID of the file's group.*/ + __dev_t st_rdev; /* Device number, if device. 
*/ +#ifndef __USE_FILE_OFFSET64 + __off_t st_size; /* Size of file, in bytes. */ +#else + __off64_t st_size; /* Size of file, in bytes. */ +#endif + __blksize_t st_blksize; /* Optimal block size for I/O. */ + +#ifndef __USE_FILE_OFFSET64 + __blkcnt_t st_blocks; /* Number of 512-byte blocks allocated. */ +#else + unsigned long __pad2; + __blkcnt64_t st_blocks; /* Number of 512-byte blocks allocated. */ +#endif +#if 0 /*def __USE_MISC*/ + /* Nanosecond resolution timestamps are stored in a format + equivalent to 'struct timespec'. This is the type used + whenever possible but the Unix namespace rules do not allow the + identifier 'timespec' to appear in the header. + Therefore we have to handle the use of this header in strictly + standard-compliant sources specially. */ + struct timespec st_atim; /* Time of last access. */ + struct timespec st_mtim; /* Time of last modification. */ + struct timespec st_ctim; /* Time of last status change. */ +# define st_atime st_atim.tv_sec /* Backward compatibility. */ +# define st_mtime st_mtim.tv_sec +# define st_ctime st_ctim.tv_sec +#else + __time_t st_atime; /* Time of last access. */ + unsigned long int st_atimensec; /* Nsecs of last access. */ + __time_t st_mtime; /* Time of last modification. */ + unsigned long int st_mtimensec; /* Nsecs of last modification. */ + __time_t st_ctime; /* Time of last status change. */ + unsigned long int st_ctimensec; /* Nsecs of last status change. */ +#endif + unsigned long int __unused4; + unsigned long int __unused5; + }; + +#ifdef __USE_LARGEFILE64 +struct stat64 + { + __dev_t st_dev; /* Device. */ + __ino64_t st_ino; /* File serial number. */ + __mode_t st_mode; /* File mode. */ + __nlink_t st_nlink; /* Link count. */ + __uid_t st_uid; /* User ID of the file's owner. */ + __gid_t st_gid; /* Group ID of the file's group.*/ + __dev_t st_rdev; /* Device number, if device. */ + __off64_t st_size; /* Size of file, in bytes. */ + __blksize_t st_blksize; /* Optimal block size for I/O. 
*/ + + unsigned long __pad2; + __blkcnt64_t st_blocks; /* Number of 512-byte blocks allocated. */ +#if 0 /*def __USE_MISC*/ + /* Nanosecond resolution timestamps are stored in a format + equivalent to 'struct timespec'. This is the type used + whenever possible but the Unix namespace rules do not allow the + identifier 'timespec' to appear in the header. + Therefore we have to handle the use of this header in strictly + standard-compliant sources specially. */ + struct timespec st_atim; /* Time of last access. */ + struct timespec st_mtim; /* Time of last modification. */ + struct timespec st_ctim; /* Time of last status change. */ +#else + __time_t st_atime; /* Time of last access. */ + unsigned long int st_atimensec; /* Nsecs of last access. */ + __time_t st_mtime; /* Time of last modification. */ + unsigned long int st_mtimensec; /* Nsecs of last modification. */ + __time_t st_ctime; /* Time of last status change. */ + unsigned long int st_ctimensec; /* Nsecs of last status change. */ +#endif + unsigned long __unused4; + unsigned long __unused5; + }; +#endif + +/* Tell code we have these members. */ +#define _STATBUF_ST_BLKSIZE +#define _STATBUF_ST_RDEV +/* Nanosecond resolution time values are supported. */ +#define _STATBUF_ST_NSEC + +/* Encoding of the file mode. */ + +#define __S_IFMT 0170000 /* These bits determine file type. */ + +/* File types. */ +#define __S_IFDIR 0040000 /* Directory. */ +#define __S_IFCHR 0020000 /* Character device. */ +#define __S_IFBLK 0060000 /* Block device. */ +#define __S_IFREG 0100000 /* Regular file. */ +#define __S_IFIFO 0010000 /* FIFO. */ +#define __S_IFLNK 0120000 /* Symbolic link. */ +#define __S_IFSOCK 0140000 /* Socket. */ + +/* POSIX.1b objects. Note that these macros always evaluate to zero. But + they do it by enforcing the correct use of the macros. 
*/ +#define __S_TYPEISMQ(buf) ((buf)->st_mode - (buf)->st_mode) +#define __S_TYPEISSEM(buf) ((buf)->st_mode - (buf)->st_mode) +#define __S_TYPEISSHM(buf) ((buf)->st_mode - (buf)->st_mode) + +/* Protection bits. */ + +#define __S_ISUID 04000 /* Set user ID on execution. */ +#define __S_ISGID 02000 /* Set group ID on execution. */ +#define __S_ISVTX 01000 /* Save swapped text after use (sticky). */ +#define __S_IREAD 0400 /* Read by owner. */ +#define __S_IWRITE 0200 /* Write by owner. */ +#define __S_IEXEC 0100 /* Execute by owner. */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/syscalls.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/syscalls.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/syscalls.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/syscalls.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,140 @@ +#ifndef _BITS_SYSCALLS_H +#define _BITS_SYSCALLS_H +#ifndef _SYSCALL_H +# error "Never use directly; include instead." +#endif + +/* + Some of the sneaky macros in the code were taken from + glibc .../sysdeps/unix/sysv/linux/xtensa/sysdep.h +*/ + +#define SYS_ify(syscall_name) __NR_##syscall_name + +#ifdef __ASSEMBLER__ + +/* The register layout upon entering the function is: + + return addr stack ptr arg0, arg1, arg2, arg3, arg4, arg5 + ----------- --------- ---------------------------------- + a0 a1 a2, a3, a4, a5, a6, a7 + + (Of course a function with say 3 arguments does not have entries for + arguments 4, 5, and 6.) + + Linux takes system-call arguments in registers. The ABI and Xtensa + software conventions require the system-call number in a2. We move any + argument that was in a2 to a6, a6 to a8 (5+ args), and a7 to a9 (6 args). + Note that for improved efficiency, we do NOT shift all parameters down + one register to maintain the original order. 
+ + syscall number arg0, arg1, arg2, arg3, arg4, arg5 + -------------- ---------------------------------- + a2 a6, a3, a4, a5, a8, a9 + + Upon return, a2 and a3 are clobbered; all other registers are preserved. */ + +#undef DO_CALL +#define DO_CALL(syscall_name, nargs) \ + DO_ARGS_##nargs \ + movi a2, SYS_ify (syscall_name); \ + syscall + +#define DO_ARGS_0 +#define DO_ARGS_1 mov a6, a2; +#define DO_ARGS_2 mov a6, a2; +#define DO_ARGS_3 mov a6, a2; +#define DO_ARGS_4 mov a6, a2; +#define DO_ARGS_5 mov a8, a6; mov a6, a2; +#define DO_ARGS_6 mov a9, a7; mov a8, a6; mov a6, a2; + +#else /* not __ASSEMBLER__ */ + +#include + +#define STR(s) #s +#define LD_ARG(n,ar) register int _a##n asm (STR(a##n)) = (int) (ar) + +#define LD_ARGS_0() +#define LD_ARGS_1(a0) LD_ARG(6,a0) +#define LD_ARGS_2(a0,a1) LD_ARGS_1(a0); LD_ARG(3,a1) +#define LD_ARGS_3(a0,a1,a2) LD_ARGS_2(a0,a1); LD_ARG(4,a2) +#define LD_ARGS_4(a0,a1,a2,a3) LD_ARGS_3(a0,a1,a2); LD_ARG(5,a3) +#define LD_ARGS_5(a0,a1,a2,a3,a4) LD_ARGS_4(a0,a1,a2,a3); LD_ARG(8,a4) +#define LD_ARGS_6(a0,a1,a2,a3,a4,a5) LD_ARGS_5(a0,a1,a2,a3,a4); LD_ARG(9,a5) + +#define ASM_ARGS_0 "r"(_a2) +#define ASM_ARGS_1 ASM_ARGS_0, "r"(_a6) +#define ASM_ARGS_2 ASM_ARGS_1, "r"(_a3) +#define ASM_ARGS_3 ASM_ARGS_2, "r"(_a4) +#define ASM_ARGS_4 ASM_ARGS_3, "r"(_a5) +#define ASM_ARGS_5 ASM_ARGS_4, "r"(_a8) +#define ASM_ARGS_6 ASM_ARGS_5, "r"(_a9) + +/* Define a macro which expands into the inline wrapper code for a system + call. */ + +#undef INLINE_SYSCALL +#define INLINE_SYSCALL(name, nr, args...) \ + ({ unsigned long resultvar = INTERNAL_SYSCALL (name, , nr, args); \ + if (__builtin_expect (INTERNAL_SYSCALL_ERROR_P (resultvar, ), 0)) \ + { \ + __set_errno (INTERNAL_SYSCALL_ERRNO (resultvar, )); \ + resultvar = (unsigned long) -1; \ + } \ + (long) resultvar; }) + +#undef INTERNAL_SYSCALL_DECL +#define INTERNAL_SYSCALL_DECL(err) do { } while (0) + +#define INTERNAL_SYSCALL_NCS(name, err, nr, args...) 
\ + ({ LD_ARG(2, name); \ + LD_ARGS_##nr(args); \ + asm volatile ("syscall\n" \ + : "=a" (_a2) \ + : ASM_ARGS_##nr \ + : "memory"); \ + (long) _a2; }) + +#undef INTERNAL_SYSCALL +#define INTERNAL_SYSCALL(name, err, nr, args...) \ + INTERNAL_SYSCALL_NCS (__NR_##name, err, nr, ##args) + +#undef INTERNAL_SYSCALL_ERROR_P +#define INTERNAL_SYSCALL_ERROR_P(val, err) \ + ((unsigned long) (val) >= -4095L) + +#undef INTERNAL_SYSCALL_ERRNO +#define INTERNAL_SYSCALL_ERRNO(val, err) (-(val)) + +#define _syscall0(args...) SYSCALL_FUNC (0, args) +#define _syscall1(args...) SYSCALL_FUNC (1, args) +#define _syscall2(args...) SYSCALL_FUNC (2, args) +#define _syscall3(args...) SYSCALL_FUNC (3, args) +#define _syscall4(args...) SYSCALL_FUNC (4, args) +#define _syscall5(args...) SYSCALL_FUNC (5, args) +#define _syscall6(args...) SYSCALL_FUNC (6, args) + +#define C_DECL_ARGS_0() void +#define C_DECL_ARGS_1(t, v) t v +#define C_DECL_ARGS_2(t, v, args...) t v, C_DECL_ARGS_1(args) +#define C_DECL_ARGS_3(t, v, args...) t v, C_DECL_ARGS_2(args) +#define C_DECL_ARGS_4(t, v, args...) t v, C_DECL_ARGS_3(args) +#define C_DECL_ARGS_5(t, v, args...) t v, C_DECL_ARGS_4(args) +#define C_DECL_ARGS_6(t, v, args...) t v, C_DECL_ARGS_5(args) + +#define C_ARGS_0() +#define C_ARGS_1(t, v) v +#define C_ARGS_2(t, v, args...) v, C_ARGS_1 (args) +#define C_ARGS_3(t, v, args...) v, C_ARGS_2 (args) +#define C_ARGS_4(t, v, args...) v, C_ARGS_3 (args) +#define C_ARGS_5(t, v, args...) v, C_ARGS_4 (args) +#define C_ARGS_6(t, v, args...) v, C_ARGS_5 (args) + +#define SYSCALL_FUNC(nargs, type, name, args...) 
\ +type name (C_DECL_ARGS_##nargs (args)) { \ + return (type) INLINE_SYSCALL (name, nargs, C_ARGS_##nargs (args)); \ +} + +#endif /* not __ASSEMBLER__ */ +#endif /* _BITS_SYSCALLS_H */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/uClibc_arch_features.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/uClibc_arch_features.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/uClibc_arch_features.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/uClibc_arch_features.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,44 @@ +/* + * Track misc arch-specific features that aren't config options + */ + +#ifndef _BITS_UCLIBC_ARCH_FEATURES_H +#define _BITS_UCLIBC_ARCH_FEATURES_H + +/* instruction used when calling abort() to kill yourself */ +#define __UCLIBC_ABORT_INSTRUCTION__ "ill" + +/* can your target use syscall6() for mmap ? */ +#define __UCLIBC_MMAP_HAS_6_ARGS__ + +/* does your target use syscall4() for truncate64 ? (32bit arches only) */ +#undef __UCLIBC_TRUNCATE64_HAS_4_ARGS__ + +/* does your target have a broken create_module() ? */ +#undef __UCLIBC_BROKEN_CREATE_MODULE__ + +/* does your target have to worry about older [gs]etrlimit() ? */ +#undef __UCLIBC_HANDLE_OLDER_RLIMIT__ + +/* does your target prefix all symbols with an _ ? */ +#define __UCLIBC_NO_UNDERSCORES__ + +/* does your target have an asm .set ? 
*/ +#define __UCLIBC_HAVE_ASM_SET_DIRECTIVE__ + +/* define if target doesn't like .global */ +#undef __UCLIBC_ASM_GLOBAL_DIRECTIVE__ + +/* define if target supports .weak */ +#define __UCLIBC_HAVE_ASM_WEAK_DIRECTIVE__ + +/* define if target supports .weakext */ +#undef __UCLIBC_HAVE_ASM_WEAKEXT_DIRECTIVE__ + +/* needed probably only for ppc64 */ +#undef __UCLIBC_HAVE_ASM_GLOBAL_DOT_NAME__ + +/* define if target supports IEEE signed zero floats */ +#define __UCLIBC_HAVE_SIGNED_ZERO__ + +#endif /* _BITS_UCLIBC_ARCH_FEATURES_H */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/uClibc_page.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/uClibc_page.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/uClibc_page.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/uClibc_page.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,31 @@ +/* Copyright (C) 2004 Erik Andersen + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * The GNU C Library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with the GNU C Library; if not, write to the Free + * Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + * 02111-1307 USA. + */ + +/* Supply an architecture specific value for PAGE_SIZE and friends. 
*/ + +#ifndef _UCLIBC_PAGE_H +#define _UCLIBC_PAGE_H + +#include + +/* PAGE_SHIFT determines the page size -- in this case 4096 */ +#define PAGE_SHIFT XCHAL_MMU_MIN_PTE_PAGE_SIZE +#define PAGE_SIZE (1UL << PAGE_SHIFT) +#define PAGE_MASK (~(PAGE_SIZE-1)) + +#endif /* _UCLIBC_PAGE_H */ diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/wordsize.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/wordsize.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/wordsize.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/wordsize.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,19 @@ +/* Copyright (C) 1999 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. */ + +#define __WORDSIZE 32 diff -Nurd uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/xtensa-config.h uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/xtensa-config.h --- uClibc-0.9.29.orig/libc/sysdeps/linux/xtensa/bits/xtensa-config.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/sysdeps/linux/xtensa/bits/xtensa-config.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,53 @@ +/* Xtensa configuration settings. + Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007 + Free Software Foundation, Inc. 
+ Contributed by Bob Wilson (bwilson at tensilica.com) at Tensilica. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#ifndef XTENSA_CONFIG_H +#define XTENSA_CONFIG_H + +/* The macros defined here match those with the same names in the Xtensa + compile-time HAL (Hardware Abstraction Layer). Please refer to the + Xtensa System Software Reference Manual for documentation of these + macros. */ + +/* The following macros reflect the default expectations for Xtensa + processor configurations that can run glibc. If you want to try + building glibc for an Xtensa configuration that is missing these + options, you will at least need to change the values of these + macros. */ + +#undef XCHAL_HAVE_NSA +#define XCHAL_HAVE_NSA 1 + +#undef XCHAL_HAVE_LOOPS +#define XCHAL_HAVE_LOOPS 1 + +/* Assume the maximum number of AR registers. This currently only affects + the __window_spill function, and it is always safe to flush extra. */ + +#undef XCHAL_NUM_AREGS +#define XCHAL_NUM_AREGS 64 + +/* Set a default page size. This is currently needed when bootstrapping + the runtime linker. See comments in dl-machine.h where this is used. 
*/ + +#undef XCHAL_MMU_MIN_PTE_PAGE_SIZE +#define XCHAL_MMU_MIN_PTE_PAGE_SIZE 12 + +#endif /* !XTENSA_CONFIG_H */ From chris at zankel.net Thu Dec 6 12:50:11 2007 From: chris at zankel.net (Chris Zankel) Date: Thu, 6 Dec 2007 12:50:11 -0800 Subject: Xtensa support for uClibc [9/9] Message-ID: <20071206205011.B12C53086E@atlanta.zankel.net> Add Xtensa support to uClibc [9/9]: old linuxthreads library support. --- diff -Nurd uClibc-0.9.29.orig/libpthread/linuxthreads.old/sysdeps/xtensa/pt-machine.h uClibc-0.9.29/libpthread/linuxthreads.old/sysdeps/xtensa/pt-machine.h --- uClibc-0.9.29.orig/libpthread/linuxthreads.old/sysdeps/xtensa/pt-machine.h 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libpthread/linuxthreads.old/sysdeps/xtensa/pt-machine.h 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,48 @@ +/* Machine-dependent pthreads configuration and inline functions. + Xtensa version. + + Copyright (C) 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#ifndef _PT_MACHINE_H +#define _PT_MACHINE_H 1 + +#include +#include + +#ifndef PT_EI +# define PT_EI extern inline +#endif + +/* Memory barrier. */ +#define MEMORY_BARRIER() __asm__ ("memw" : : : "memory") + +/* Spinlock implementation; required. 
*/ +PT_EI long int +testandset (int *spinlock) +{ + int unused = 0; + return INTERNAL_SYSCALL (xtensa, , 4, SYS_XTENSA_ATOMIC_SET, + spinlock, 1, unused); +} + +/* Get some notion of the current stack. Need not be exactly the top + of the stack, just something somewhere in the current frame. */ +#define CURRENT_STACK_FRAME __builtin_frame_address (0) + +#endif /* _PT_MACHINE_H */ From chris at zankel.net Thu Dec 6 12:49:47 2007 From: chris at zankel.net (Chris Zankel) Date: Thu, 6 Dec 2007 12:49:47 -0800 Subject: Xtensa support for uClibc [3/9] Message-ID: <20071206204947.D793A3085A@atlanta.zankel.net> Add support for Xtensa to uClibc [3/9]: Optimized string functions for Xtensa. --- diff -Nurd uClibc-0.9.29.orig/libc/string/xtensa/Makefile uClibc-0.9.29/libc/string/xtensa/Makefile --- uClibc-0.9.29.orig/libc/string/xtensa/Makefile 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/string/xtensa/Makefile 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,13 @@ +# Makefile for uClibc +# +# Copyright (C) 2000-2005 Erik Andersen +# +# Licensed under the LGPL v2.1, see the file COPYING.LIB in this tarball. +# + +top_srcdir:=../../../ +top_builddir:=../../../ +all: objs +include $(top_builddir)Rules.mak +include ../Makefile.in +include $(top_srcdir)Makerules diff -Nurd uClibc-0.9.29.orig/libc/string/xtensa/memcpy.S uClibc-0.9.29/libc/string/xtensa/memcpy.S --- uClibc-0.9.29.orig/libc/string/xtensa/memcpy.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/string/xtensa/memcpy.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,297 @@ +/* Optimized memcpy for Xtensa. + Copyright (C) 2001, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#include "../../sysdeps/linux/xtensa/sysdep.h" +#include + + .macro src_b r, w0, w1 +#ifdef __XTENSA_EB__ + src \r, \w0, \w1 +#else + src \r, \w1, \w0 +#endif + .endm + + .macro ssa8 r +#ifdef __XTENSA_EB__ + ssa8b \r +#else + ssa8l \r +#endif + .endm + +/* If the Xtensa Unaligned Load Exception option is not used, this + code can run a few cycles faster by relying on the low address bits + being ignored. However, if the code is then run with an Xtensa ISS + client that checks for unaligned accesses, it will produce a lot of + warning messages. Set this flag to disable the use of unaligned + accesses and keep the ISS happy. */ + +#define UNALIGNED_ADDRESSES_CHECKED 1 + +/* Do not use .literal_position in the ENTRY macro. */ +#undef LITERAL_POSITION +#define LITERAL_POSITION + + +/* void *memcpy (void *dst, const void *src, size_t len) + + The algorithm is as follows: + + If the destination is unaligned, align it by conditionally + copying 1- and/or 2-byte pieces. + + If the source is aligned, copy 16 bytes with a loop, and then finish up + with 8, 4, 2, and 1-byte copies conditional on the length. + + Else (if source is unaligned), do the same, but use SRC to align the + source data. + + This code tries to use fall-through branches for the common + case of aligned source and destination and multiple of 4 (or 8) length. */ + + +/* Byte by byte copy. 
*/ + + .text + .align 4 + .literal_position +__memcpy_aux: + + /* Skip a byte to get 1 mod 4 alignment for LOOPNEZ + (0 mod 4 alignment for LBEG). */ + .byte 0 + +.Lbytecopy: +#if XCHAL_HAVE_LOOPS + loopnez a4, 2f +#else + beqz a4, 2f + add a7, a3, a4 // a7 = end address for source +#endif +1: l8ui a6, a3, 0 + addi a3, a3, 1 + s8i a6, a5, 0 + addi a5, a5, 1 +#if !XCHAL_HAVE_LOOPS + blt a3, a7, 1b +#endif +2: retw + + +/* Destination is unaligned. */ + + .align 4 +.Ldst1mod2: // dst is only byte aligned + + /* Do short copies byte-by-byte. */ + _bltui a4, 7, .Lbytecopy + + /* Copy 1 byte. */ + l8ui a6, a3, 0 + addi a3, a3, 1 + addi a4, a4, -1 + s8i a6, a5, 0 + addi a5, a5, 1 + + /* Return to main algorithm if dst is now aligned. */ + _bbci.l a5, 1, .Ldstaligned + +.Ldst2mod4: // dst has 16-bit alignment + + /* Do short copies byte-by-byte. */ + _bltui a4, 6, .Lbytecopy + + /* Copy 2 bytes. */ + l8ui a6, a3, 0 + l8ui a7, a3, 1 + addi a3, a3, 2 + addi a4, a4, -2 + s8i a6, a5, 0 + s8i a7, a5, 1 + addi a5, a5, 2 + + /* dst is now aligned; return to main algorithm. */ + j .Ldstaligned + + +ENTRY (memcpy) + /* a2 = dst, a3 = src, a4 = len */ + + mov a5, a2 // copy dst so that a2 is return value + _bbsi.l a2, 0, .Ldst1mod2 + _bbsi.l a2, 1, .Ldst2mod4 +.Ldstaligned: + + /* Get number of loop iterations with 16B per iteration. */ + srli a7, a4, 4 + + /* Check if source is aligned. */ + movi a8, 3 + _bany a3, a8, .Lsrcunaligned + + /* Destination and source are word-aligned, use word copy. */ +#if XCHAL_HAVE_LOOPS + loopnez a7, 2f +#else + beqz a7, 2f + slli a8, a7, 4 + add a8, a8, a3 // a8 = end of last 16B source chunk +#endif +1: l32i a6, a3, 0 + l32i a7, a3, 4 + s32i a6, a5, 0 + l32i a6, a3, 8 + s32i a7, a5, 4 + l32i a7, a3, 12 + s32i a6, a5, 8 + addi a3, a3, 16 + s32i a7, a5, 12 + addi a5, a5, 16 +#if !XCHAL_HAVE_LOOPS + blt a3, a8, 1b +#endif + + /* Copy any leftover pieces smaller than 16B. */ +2: bbci.l a4, 3, 3f + + /* Copy 8 bytes. 
*/ + l32i a6, a3, 0 + l32i a7, a3, 4 + addi a3, a3, 8 + s32i a6, a5, 0 + s32i a7, a5, 4 + addi a5, a5, 8 + +3: bbsi.l a4, 2, 4f + bbsi.l a4, 1, 5f + bbsi.l a4, 0, 6f + retw + + /* Copy 4 bytes. */ +4: l32i a6, a3, 0 + addi a3, a3, 4 + s32i a6, a5, 0 + addi a5, a5, 4 + bbsi.l a4, 1, 5f + bbsi.l a4, 0, 6f + retw + + /* Copy 2 bytes. */ +5: l16ui a6, a3, 0 + addi a3, a3, 2 + s16i a6, a5, 0 + addi a5, a5, 2 + bbsi.l a4, 0, 6f + retw + + /* Copy 1 byte. */ +6: l8ui a6, a3, 0 + s8i a6, a5, 0 + +.Ldone: + retw + + +/* Destination is aligned; source is unaligned. */ + + .align 4 +.Lsrcunaligned: + /* Avoid loading anything for zero-length copies. */ + _beqz a4, .Ldone + + /* Copy 16 bytes per iteration for word-aligned dst and + unaligned src. */ + ssa8 a3 // set shift amount from byte offset +#if UNALIGNED_ADDRESSES_CHECKED + and a11, a3, a8 // save unalignment offset for below + sub a3, a3, a11 // align a3 +#endif + l32i a6, a3, 0 // load first word +#if XCHAL_HAVE_LOOPS + loopnez a7, 2f +#else + beqz a7, 2f + slli a10, a7, 4 + add a10, a10, a3 // a10 = end of last 16B source chunk +#endif +1: l32i a7, a3, 4 + l32i a8, a3, 8 + src_b a6, a6, a7 + s32i a6, a5, 0 + l32i a9, a3, 12 + src_b a7, a7, a8 + s32i a7, a5, 4 + l32i a6, a3, 16 + src_b a8, a8, a9 + s32i a8, a5, 8 + addi a3, a3, 16 + src_b a9, a9, a6 + s32i a9, a5, 12 + addi a5, a5, 16 +#if !XCHAL_HAVE_LOOPS + blt a3, a10, 1b +#endif + +2: bbci.l a4, 3, 3f + + /* Copy 8 bytes. */ + l32i a7, a3, 4 + l32i a8, a3, 8 + src_b a6, a6, a7 + s32i a6, a5, 0 + addi a3, a3, 8 + src_b a7, a7, a8 + s32i a7, a5, 4 + addi a5, a5, 8 + mov a6, a8 + +3: bbci.l a4, 2, 4f + + /* Copy 4 bytes. */ + l32i a7, a3, 4 + addi a3, a3, 4 + src_b a6, a6, a7 + s32i a6, a5, 0 + addi a5, a5, 4 + mov a6, a7 +4: +#if UNALIGNED_ADDRESSES_CHECKED + add a3, a3, a11 // readjust a3 with correct misalignment +#endif + bbsi.l a4, 1, 5f + bbsi.l a4, 0, 6f + retw + + /* Copy 2 bytes. 
*/ +5: l8ui a6, a3, 0 + l8ui a7, a3, 1 + addi a3, a3, 2 + s8i a6, a5, 0 + s8i a7, a5, 1 + addi a5, a5, 2 + bbsi.l a4, 0, 6f + retw + + /* Copy 1 byte. */ +6: l8ui a6, a3, 0 + s8i a6, a5, 0 + retw + +libc_hidden_def (memcpy) diff -Nurd uClibc-0.9.29.orig/libc/string/xtensa/memset.S uClibc-0.9.29/libc/string/xtensa/memset.S --- uClibc-0.9.29.orig/libc/string/xtensa/memset.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/string/xtensa/memset.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,165 @@ +/* Optimized memset for Xtensa. + Copyright (C) 2001, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. */ + +#include "../../sysdeps/linux/xtensa/sysdep.h" +#include + +/* Do not use .literal_position in the ENTRY macro. */ +#undef LITERAL_POSITION +#define LITERAL_POSITION + +/* void *memset (void *dst, int c, size_t length) + + The algorithm is as follows: + + Create a word with c in all byte positions. + + If the destination is aligned, set 16B chunks with a loop, and then + finish up with 8B, 4B, 2B, and 1B stores conditional on the length. + + If the destination is unaligned, align it by conditionally + setting 1B and/or 2B and then go to aligned case. 
+ + This code tries to use fall-through branches for the common + case of an aligned destination (except for the branches to + the alignment labels). */ + + +/* Byte-by-byte set. */ + + .text + .align 4 + .literal_position +__memset_aux: + + /* Skip a byte to get 1 mod 4 alignment for LOOPNEZ + (0 mod 4 alignment for LBEG). */ + .byte 0 + +.Lbyteset: +#if XCHAL_HAVE_LOOPS + loopnez a4, 2f +#else + beqz a4, 2f + add a6, a5, a4 // a6 = ending address +#endif +1: s8i a3, a5, 0 + addi a5, a5, 1 +#if !XCHAL_HAVE_LOOPS + blt a5, a6, 1b +#endif +2: retw + + +/* Destination is unaligned. */ + + .align 4 + +.Ldst1mod2: // dst is only byte aligned + + /* Do short sizes byte-by-byte. */ + bltui a4, 8, .Lbyteset + + /* Set 1 byte. */ + s8i a3, a5, 0 + addi a5, a5, 1 + addi a4, a4, -1 + + /* Now retest if dst is aligned. */ + _bbci.l a5, 1, .Ldstaligned + +.Ldst2mod4: // dst has 16-bit alignment + + /* Do short sizes byte-by-byte. */ + bltui a4, 8, .Lbyteset + + /* Set 2 bytes. */ + s16i a3, a5, 0 + addi a5, a5, 2 + addi a4, a4, -2 + + /* dst is now aligned; return to main algorithm */ + j .Ldstaligned + + +ENTRY (memset) + /* a2 = dst, a3 = c, a4 = length */ + + /* Duplicate character into all bytes of word. */ + extui a3, a3, 0, 8 + slli a7, a3, 8 + or a3, a3, a7 + slli a7, a3, 16 + or a3, a3, a7 + + mov a5, a2 // copy dst so that a2 is return value + + /* Check if dst is unaligned. */ + _bbsi.l a2, 0, .Ldst1mod2 + _bbsi.l a2, 1, .Ldst2mod4 +.Ldstaligned: + + /* Get number of loop iterations with 16B per iteration. */ + srli a7, a4, 4 + + /* Destination is word-aligned. */ +#if XCHAL_HAVE_LOOPS + loopnez a7, 2f +#else + beqz a7, 2f + slli a6, a7, 4 + add a6, a6, a5 // a6 = end of last 16B chunk +#endif + /* Set 16 bytes per iteration. */ +1: s32i a3, a5, 0 + s32i a3, a5, 4 + s32i a3, a5, 8 + s32i a3, a5, 12 + addi a5, a5, 16 +#if !XCHAL_HAVE_LOOPS + blt a5, a6, 1b +#endif + + /* Set any leftover pieces smaller than 16B. */ +2: bbci.l a4, 3, 3f + + /* Set 8 bytes. 
*/ + s32i a3, a5, 0 + s32i a3, a5, 4 + addi a5, a5, 8 + +3: bbci.l a4, 2, 4f + + /* Set 4 bytes. */ + s32i a3, a5, 0 + addi a5, a5, 4 + +4: bbci.l a4, 1, 5f + + /* Set 2 bytes. */ + s16i a3, a5, 0 + addi a5, a5, 2 + +5: bbci.l a4, 0, 6f + + /* Set 1 byte. */ + s8i a3, a5, 0 +6: retw + +libc_hidden_def (memset) diff -Nurd uClibc-0.9.29.orig/libc/string/xtensa/strcmp.S uClibc-0.9.29/libc/string/xtensa/strcmp.S --- uClibc-0.9.29.orig/libc/string/xtensa/strcmp.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/string/xtensa/strcmp.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,313 @@ +/* Optimized strcmp for Xtensa. + Copyright (C) 2001, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. 
*/ + +#include "../../sysdeps/linux/xtensa/sysdep.h" +#include + +#ifdef __XTENSA_EB__ +#define MASK0 0xff000000 +#define MASK1 0x00ff0000 +#define MASK2 0x0000ff00 +#define MASK3 0x000000ff +#else +#define MASK0 0x000000ff +#define MASK1 0x0000ff00 +#define MASK2 0x00ff0000 +#define MASK3 0xff000000 +#endif + +#define MASK4 0x40404040 + + .literal .Lmask0, MASK0 + .literal .Lmask1, MASK1 + .literal .Lmask2, MASK2 + .literal .Lmask3, MASK3 + .literal .Lmask4, MASK4 + + .text +ENTRY (strcmp) + /* a2 = s1, a3 = s2 */ + + l8ui a8, a2, 0 // byte 0 from s1 + l8ui a9, a3, 0 // byte 0 from s2 + movi a10, 3 // mask + bne a8, a9, .Lretdiff + + or a11, a2, a3 + bnone a11, a10, .Laligned + + xor a11, a2, a3 // compare low two bits of s1 and s2 + bany a11, a10, .Lunaligned // if they have different alignment + + /* s1/s2 are not word-aligned. */ + addi a2, a2, 1 // advance s1 + beqz a8, .Leq // bytes equal, if zero, strings are equal + addi a3, a3, 1 // advance s2 + bnone a2, a10, .Laligned // if s1/s2 now aligned + l8ui a8, a2, 0 // byte 1 from s1 + l8ui a9, a3, 0 // byte 1 from s2 + addi a2, a2, 1 // advance s1 + bne a8, a9, .Lretdiff // if different, return difference + beqz a8, .Leq // bytes equal, if zero, strings are equal + addi a3, a3, 1 // advance s2 + bnone a2, a10, .Laligned // if s1/s2 now aligned + l8ui a8, a2, 0 // byte 2 from s1 + l8ui a9, a3, 0 // byte 2 from s2 + addi a2, a2, 1 // advance s1 + bne a8, a9, .Lretdiff // if different, return difference + beqz a8, .Leq // bytes equal, if zero, strings are equal + addi a3, a3, 1 // advance s2 + j .Laligned + +/* s1 and s2 have different alignment. + + If the zero-overhead loop option is available, use an (almost) + infinite zero-overhead loop with conditional exits so we only pay + for taken branches when exiting the loop. 
+ + Note: It is important for this unaligned case to come before the + code for aligned strings, because otherwise some of the branches + above cannot reach and have to be transformed to branches around + jumps. The unaligned code is smaller and the branches can reach + over it. */ + + .align 4 + /* (2 mod 4) alignment for loop instruction */ +.Lunaligned: +#if XCHAL_HAVE_LOOPS + _movi.n a8, 0 // set up for the maximum loop count + loop a8, .Lretdiff // loop forever (almost anyway) +#endif +.Lnextbyte: + l8ui a8, a2, 0 + l8ui a9, a3, 0 + addi a2, a2, 1 + bne a8, a9, .Lretdiff + addi a3, a3, 1 +#if XCHAL_HAVE_LOOPS + beqz a8, .Lretdiff +#else + bnez a8, .Lnextbyte +#endif +.Lretdiff: + sub a2, a8, a9 + retw + +/* s1 is word-aligned; s2 is word-aligned. + + If the zero-overhead loop option is available, use an (almost) + infinite zero-overhead loop with conditional exits so we only pay + for taken branches when exiting the loop. */ + +/* New algorithm, relying on the fact that all normal ASCII is between + 32 and 127. + + Rather than check all bytes for zero: + Take one word (4 bytes). Call it w1. + Shift w1 left by one into w1'. + Or w1 and w1'. For all normal ASCII bit 6 will be 1; for zero it won't. + Check that all 4 bit 6's (one for each byte) are one: + If they are, we are definitely not done. + If they are not, we are probably done, but need to check for zero. */ + + .align 4 +#if XCHAL_HAVE_LOOPS +.Laligned: + .begin no-transform + l32r a4, .Lmask0 // mask for byte 0 + l32r a7, .Lmask4 + /* Loop forever. (a4 is more than the maximum number + of iterations) */ + loop a4, .Laligned_done + + /* First unrolled loop body. */ + l32i a8, a2, 0 // get word from s1 + l32i a9, a3, 0 // get word from s2 + slli a5, a8, 1 + bne a8, a9, .Lwne2 + or a9, a8, a5 + bnall a9, a7, .Lprobeq + + /* Second unrolled loop body. 
*/ + l32i a8, a2, 4 // get word from s1+4 + l32i a9, a3, 4 // get word from s2+4 + slli a5, a8, 1 + bne a8, a9, .Lwne2 + or a9, a8, a5 + bnall a9, a7, .Lprobeq2 + + addi a2, a2, 8 // advance s1 pointer + addi a3, a3, 8 // advance s2 pointer +.Laligned_done: + or a1, a1, a1 // nop + +.Lprobeq2: + /* Adjust pointers to account for the loop unrolling. */ + addi a2, a2, 4 + addi a3, a3, 4 + +#else /* !XCHAL_HAVE_LOOPS */ + +.Laligned: + movi a4, MASK0 // mask for byte 0 + movi a7, MASK4 + j .Lfirstword +.Lnextword: + addi a2, a2, 4 // advance s1 pointer + addi a3, a3, 4 // advance s2 pointer +.Lfirstword: + l32i a8, a2, 0 // get word from s1 + l32i a9, a3, 0 // get word from s2 + slli a5, a8, 1 + bne a8, a9, .Lwne2 + or a9, a8, a5 + ball a9, a7, .Lnextword +#endif /* !XCHAL_HAVE_LOOPS */ + + /* align (0 mod 4) */ +.Lprobeq: + /* Words are probably equal, but check for sure. + If not, loop over the rest of string using normal algorithm. */ + + bnone a8, a4, .Leq // if byte 0 is zero + l32r a5, .Lmask1 // mask for byte 1 + l32r a6, .Lmask2 // mask for byte 2 + bnone a8, a5, .Leq // if byte 1 is zero + l32r a7, .Lmask3 // mask for byte 3 + bnone a8, a6, .Leq // if byte 2 is zero + bnone a8, a7, .Leq // if byte 3 is zero + addi.n a2, a2, 4 // advance s1 pointer + addi.n a3, a3, 4 // advance s2 pointer +#if XCHAL_HAVE_LOOPS + + /* align (1 mod 4) */ + loop a4, .Leq // loop forever (a4 is bigger than max iters) + .end no-transform + + l32i a8, a2, 0 // get word from s1 + l32i a9, a3, 0 // get word from s2 + addi a2, a2, 4 // advance s1 pointer + bne a8, a9, .Lwne + bnone a8, a4, .Leq // if byte 0 is zero + bnone a8, a5, .Leq // if byte 1 is zero + bnone a8, a6, .Leq // if byte 2 is zero + bnone a8, a7, .Leq // if byte 3 is zero + addi a3, a3, 4 // advance s2 pointer + +#else /* !XCHAL_HAVE_LOOPS */ + + j .Lfirstword2 +.Lnextword2: + addi a3, a3, 4 // advance s2 pointer +.Lfirstword2: + l32i a8, a2, 0 // get word from s1 + l32i a9, a3, 0 // get word from s2 + addi a2, a2, 4 
// advance s1 pointer + bne a8, a9, .Lwne + bnone a8, a4, .Leq // if byte 0 is zero + bnone a8, a5, .Leq // if byte 1 is zero + bnone a8, a6, .Leq // if byte 2 is zero + bany a8, a7, .Lnextword2 // if byte 3 is nonzero +#endif /* !XCHAL_HAVE_LOOPS */ + + /* Words are equal; some byte is zero. */ +.Leq: movi a2, 0 // return equal + retw + +.Lwne2: /* Words are not equal. On big-endian processors, if none of the + bytes are zero, the return value can be determined by a simple + comparison. */ +#ifdef __XTENSA_EB__ + or a10, a8, a5 + bnall a10, a7, .Lsomezero + bgeu a8, a9, .Lposreturn + movi a2, -1 + retw +.Lposreturn: + movi a2, 1 + retw +.Lsomezero: // There is probably some zero byte. +#endif /* __XTENSA_EB__ */ +.Lwne: /* Words are not equal. */ + xor a2, a8, a9 // get word with nonzero in byte that differs + bany a2, a4, .Ldiff0 // if byte 0 differs + movi a5, MASK1 // mask for byte 1 + bnone a8, a4, .Leq // if byte 0 is zero + bany a2, a5, .Ldiff1 // if byte 1 differs + movi a6, MASK2 // mask for byte 2 + bnone a8, a5, .Leq // if byte 1 is zero + bany a2, a6, .Ldiff2 // if byte 2 differs + bnone a8, a6, .Leq // if byte 2 is zero +#ifdef __XTENSA_EB__ +.Ldiff3: +.Ldiff2: +.Ldiff1: + /* Byte 0 is equal (at least) and there is a difference before a zero + byte. Just subtract words to get the return value. + The high order equal bytes cancel, leaving room for the sign. */ + sub a2, a8, a9 + retw + +.Ldiff0: + /* Need to make room for the sign, so can't subtract whole words. */ + extui a10, a8, 24, 8 + extui a11, a9, 24, 8 + sub a2, a10, a11 + retw + +#else /* !__XTENSA_EB__ */ + /* Little-endian is a little more difficult because we can't subtract + whole words. */ +.Ldiff3: + /* Bytes 0-2 are equal; byte 3 is different. + For little-endian need to have a sign bit for the difference. */ + extui a10, a8, 24, 8 + extui a11, a9, 24, 8 + sub a2, a10, a11 + retw + +.Ldiff0: + /* Byte 0 is different. 
*/ + extui a10, a8, 0, 8 + extui a11, a9, 0, 8 + sub a2, a10, a11 + retw + +.Ldiff1: + /* Byte 0 is equal; byte 1 is different. */ + extui a10, a8, 8, 8 + extui a11, a9, 8, 8 + sub a2, a10, a11 + retw + +.Ldiff2: + /* Bytes 0-1 are equal; byte 2 is different. */ + extui a10, a8, 16, 8 + extui a11, a9, 16, 8 + sub a2, a10, a11 + retw + +#endif /* !__XTENSA_EB__ */ + +libc_hidden_def (strcmp) + +#ifndef __UCLIBC_HAS_LOCALE__ +strong_alias (strcmp, strcoll) +libc_hidden_def (strcoll) +#endif diff -Nurd uClibc-0.9.29.orig/libc/string/xtensa/strcpy.S uClibc-0.9.29/libc/string/xtensa/strcpy.S --- uClibc-0.9.29.orig/libc/string/xtensa/strcpy.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/string/xtensa/strcpy.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,150 @@ +/* Optimized strcpy for Xtensa. + Copyright (C) 2001, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 51 Franklin Street - Fifth Floor, + Boston, MA 02110-1301, USA. 
*/ + +#include "../../sysdeps/linux/xtensa/sysdep.h" +#include + +#ifdef __XTENSA_EB__ +#define MASK0 0xff000000 +#define MASK1 0x00ff0000 +#define MASK2 0x0000ff00 +#define MASK3 0x000000ff +#else +#define MASK0 0x000000ff +#define MASK1 0x0000ff00 +#define MASK2 0x00ff0000 +#define MASK3 0xff000000 +#endif + + .text +ENTRY (strcpy) + /* a2 = dst, a3 = src */ + + mov a10, a2 // leave dst in return value register + movi a4, MASK0 + movi a5, MASK1 + movi a6, MASK2 + movi a7, MASK3 + bbsi.l a3, 0, .Lsrc1mod2 + bbsi.l a3, 1, .Lsrc2mod4 +.Lsrcaligned: + + /* Check if the destination is aligned. */ + movi a8, 3 + bnone a10, a8, .Laligned + + j .Ldstunaligned + +.Lsrc1mod2: // src address is odd + l8ui a8, a3, 0 // get byte 0 + addi a3, a3, 1 // advance src pointer + s8i a8, a10, 0 // store byte 0 + beqz a8, 1f // if byte 0 is zero + addi a10, a10, 1 // advance dst pointer + bbci.l a3, 1, .Lsrcaligned // if src is now word-aligned + +.Lsrc2mod4: // src address is 2 mod 4 + l8ui a8, a3, 0 // get byte 0 + /* 1-cycle interlock */ + s8i a8, a10, 0 // store byte 0 + beqz a8, 1f // if byte 0 is zero + l8ui a8, a3, 1 // get byte 1 + addi a3, a3, 2 // advance src pointer + s8i a8, a10, 1 // store byte 1 + addi a10, a10, 2 // advance dst pointer + bnez a8, .Lsrcaligned +1: retw + + +/* dst is word-aligned; src is word-aligned. 
*/ + + .align 4 +#if XCHAL_HAVE_LOOPS + /* (2 mod 4) alignment for loop instruction */ +.Laligned: + _movi.n a8, 0 // set up for the maximum loop count + loop a8, .Lz3 // loop forever (almost anyway) + l32i a8, a3, 0 // get word from src + addi a3, a3, 4 // advance src pointer + bnone a8, a4, .Lz0 // if byte 0 is zero + bnone a8, a5, .Lz1 // if byte 1 is zero + bnone a8, a6, .Lz2 // if byte 2 is zero + s32i a8, a10, 0 // store word to dst + bnone a8, a7, .Lz3 // if byte 3 is zero + addi a10, a10, 4 // advance dst pointer + +#else /* !XCHAL_HAVE_LOOPS */ + +1: addi a10, a10, 4 // advance dst pointer +.Laligned: + l32i a8, a3, 0 // get word from src + addi a3, a3, 4 // advance src pointer + bnone a8, a4, .Lz0 // if byte 0 is zero + bnone a8, a5, .Lz1 // if byte 1 is zero + bnone a8, a6, .Lz2 // if byte 2 is zero + s32i a8, a10, 0 // store word to dst + bany a8, a7, 1b // if byte 3 is nonzero +#endif /* !XCHAL_HAVE_LOOPS */ + +.Lz3: /* Byte 3 is zero. */ + retw + +.Lz0: /* Byte 0 is zero. */ +#ifdef __XTENSA_EB__ + movi a8, 0 +#endif + s8i a8, a10, 0 + retw + +.Lz1: /* Byte 1 is zero. */ +#ifdef __XTENSA_EB__ + extui a8, a8, 16, 16 +#endif + s16i a8, a10, 0 + retw + +.Lz2: /* Byte 2 is zero. 
*/ +#ifdef __XTENSA_EB__ + extui a8, a8, 16, 16 +#endif + s16i a8, a10, 0 + movi a8, 0 + s8i a8, a10, 2 + retw + + .align 4 + /* (2 mod 4) alignment for loop instruction */ +.Ldstunaligned: + +#if XCHAL_HAVE_LOOPS + _movi.n a8, 0 // set up for the maximum loop count + loop a8, 2f // loop forever (almost anyway) +#endif +1: l8ui a8, a3, 0 + addi a3, a3, 1 + s8i a8, a10, 0 + addi a10, a10, 1 +#if XCHAL_HAVE_LOOPS + beqz a8, 2f +#else + bnez a8, 1b +#endif +2: retw + +libc_hidden_def (strcpy) diff -Nurd uClibc-0.9.29.orig/libc/string/xtensa/strlen.S uClibc-0.9.29/libc/string/xtensa/strlen.S --- uClibc-0.9.29.orig/libc/string/xtensa/strlen.S 1969-12-31 16:00:00.000000000 -0800 +++ uClibc-0.9.29/libc/string/xtensa/strlen.S 2007-12-04 11:38:00.000000000 -0800 @@ -0,0 +1,104 @@ +/* Optimized strlen for Xtensa. + Copyright (C) 2001, 2007 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with t