| Topic: |
DEVELOP > c-Plus-Plus |
| User: |
"Geometer" |
| Date: |
06 May 2006 09:04:30 AM |
| Object: |
strtok behavior with multiple consecutive delimiters |
Hello, and good whatever daytime is at your place..
please can somebody tell me, what the standard behavior of strtok shall be,
if it encounters two or more consecutive delimiters like in
(checks omitted)
char tst[] = "this\nis\n\nan\nempty\n\n\nline";
                      ^^^^         ^^^^^^
char *tok = strtok(tst, "\n");
tok = strtok(NULL, "\n");
and so on..
will the groups of '\n' marked above be consumed one by one or the whole
group together?
Thank you very much
.
|
|
| User: "Ben Pfaff" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 12:10:34 PM |
|
|
"Geometer" <n@n.com> writes:
> please can somebody tell me, what the standard behavior of strtok shall be,
> if it encounters two or more consecutive delimiters like in
strtok() has at least these problems:
* It merges adjacent delimiters. If you use a comma as your
delimiter, then "a,,b,c" will be divided into three tokens,
not four. This is often the wrong thing to do. In fact, it
is only the right thing to do, in my experience, when the
delimiter set contains white space (for dividing a string
into "words") or it is known in advance that there will be
no adjacent delimiters.
* The identity of the delimiter is lost, because it is
changed to a null terminator.
* It modifies the string that it tokenizes. This is bad
because it forces you to make a copy of the string if
you want to use it later. It also means that you can't
tokenize a string literal with it; this is not
necessarily something you'd want to do all the time but
it is surprising.
* It can only be used once at a time. If a sequence of
strtok() calls is ongoing and another one is started,
the state of the first one is lost. This isn't a
problem for small programs but it is easy to lose track
of such things in hierarchies of nested functions in
large programs. In other words, strtok() breaks
encapsulation.
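To make the first point concrete for the original question, here is a minimal sketch. The helper name `strtok_tokens` is hypothetical (it does not come from the thread); it tokenizes a private copy of the input, because strtok() overwrites delimiters with null characters.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: run std::strtok over a writable copy of `text`
// and return the tokens in the order strtok reports them.
std::vector<std::string> strtok_tokens(const std::string& text,
                                       const char* delims)
{
    // strtok needs a null-terminated buffer it is allowed to modify.
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');

    std::vector<std::string> out;
    for (char* tok = std::strtok(buf.data(), delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        out.push_back(tok);   // a run of adjacent delimiters never yields ""
    }
    return out;
}
```

Called on "a,,b,c" with "," this returns three tokens, and on the string from the original question it returns five ("this", "is", "an", "empty", "line"): each group of '\n' is consumed as a whole, so the empty lines never show up as tokens.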
--
"What is appropriate for the master is not appropriate for the novice.
You must understand the Tao before transcending structure."
--The Tao of Programming
.
|
|
|
|
| User: "Peter Jansson" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 09:27:09 AM |
|
|
Geometer wrote:
Hello, and good whatever daytime is at your place..
please can somebody tell me, what the standard behavior of strtok shall be,
if it encounters two or more consecutive delimiters like in
(checks omitted)
char tst[] = "this\nis\n\nan\nempty\n\n\nline";
                      ^^^^         ^^^^^^
char *tok = strtok(tst, "\n");
tok = strtok(NULL, "\n");
and so on..
will the groups of '\n' marked above be consumed one by one or the whole
group together?
Thank you very much
<quote src="A man-page for strok.">
Never use these functions. If you do, note that:
These functions modify their first argument.
These functions cannot be used on constant strings.
The identity of the delimiting character is lost.
The strtok() function uses a static buffer while parsing,
so it’s not thread safe.
</quote>
Regards,
Peter Jansson
http://www.p-jansson.com/
http://www.jansson.net/
.
|
|
|
| User: "Jerry Coffin" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 11:17:58 AM |
|
|
In article <1n27g.55903$d5.210494@newsb.telia.net>,
webmaster@jansson.net says...
[ ... ]
> The strtok() function uses a static buffer while parsing,
> so it’s not thread safe.
More accurately, it uses a static pointer while parsing,
so the vendor has to go to extra work to make it thread
safe. The same is true with a number of other functions
as well, though -- much of what's defined in time.h, to
give only one obvious example.
--
Later,
Jerry.
The universe is a figment of its own imagination.
.
|
|
|
|
| User: "Pete Becker" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 01:29:11 PM |
|
|
Peter Jansson wrote:
> <quote src="A man-page for strok.">
The name of the function is strtok.
> Never use these functions. If you do, note that:
> These functions modify their first argument.
> These functions cannot be used on constant strings.
These two say the same thing. Sounds like someone is trying too hard.
> The identity of the delimiting character is lost.
Which has nothing to do with the claim that you should never use it. You
shouldn't use it if you need to know which of the delimiters was
actually encountered.
> The strtok() function uses a static buffer while parsing,
No, it uses a static variable to hold its result BETWEEN calls.
> so it’s not thread safe.
Non sequitur. It's easy enough to implement with a per-thread static
pointer, which is thread safe.
Yup, definitely trying too hard. strtok is well suited for what it does.
If you need something more elaborate, go for it.
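The "per-thread static pointer" idea can be sketched roughly as follows. This is a hypothetical re-implementation for illustration only, assuming C++11 `thread_local`; the name `tl_strtok` is made up, and no claim is made that any vendor's library is written this way.

```cpp
#include <cstring>

// Hypothetical strtok look-alike whose saved position is per thread.
char* tl_strtok(char* str, const char* delims)
{
    thread_local char* saved = nullptr;   // state kept BETWEEN calls

    if (str == nullptr) str = saved;      // continue the previous scan
    if (str == nullptr) return nullptr;   // nothing left to scan

    str += std::strspn(str, delims);      // skip a whole run of delimiters
    if (*str == '\0') {
        saved = nullptr;
        return nullptr;
    }

    char* end = str + std::strcspn(str, delims);  // end of this token
    if (*end != '\0') {
        *end = '\0';                      // the delimiter's identity is lost here
        saved = end + 1;
    } else {
        saved = nullptr;                  // token ran to the end of the string
    }
    return str;
}
```

Each thread gets its own `saved`, so two threads cannot disturb each other. Two interleaved scans within one thread still share it, which is the separate encapsulation problem raised earlier in the thread.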
--
Pete Becker
Roundhouse Consulting, Ltd.
.
|
|
|
|
| User: "CBFalconer" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 10:11:21 AM |
|
|
Peter Jansson wrote:
Geometer wrote:
please can somebody tell me, what the standard behavior of
strtok shall be, if it encounters two or more consecutive
delimiters like in (checks omitted)
char tst[] = "this\nis\n\nan\nempty\n\n\nline";
                      ^^^^         ^^^^^^
char *tok = strtok(tst, "\n");
tok = strtok(NULL, "\n");
and so on..
will the groups of '\n' marked above be consumed one by one or
the whole group together?
<quote src="A man-page for strok.">
Never use these functions. If you do, note that:
These functions modify their first argument.
These functions cannot be used on constant strings.
The identity of the delimiting character is lost.
The strtok() function uses a static buffer while parsing,
so it’s not thread safe.
</quote>
The OP can simply use the following replacement function, which
does not have those objectionable features. The testing code is
longer than the function.
/* ------- file toksplit.c ----------*/
#include "toksplit.h"
/* copy over the next token from an input string, after
   skipping leading blanks (or other whitespace?). The
   token is terminated by the first appearance of tokchar,
   or by the end of the source string.
   The caller must supply sufficient space in token to
   receive any token; otherwise tokens will be truncated.
   Returns: a pointer past the terminating tokchar.
   This will happily return an infinity of empty tokens if
   called with src pointing to the end of a string. Tokens
   will never include a copy of tokchar.
   A better name would be "strtkn", except that is reserved
   for the system namespace. Change to that at your risk.
   released to Public Domain, by C.B. Falconer.
   Published 2006-02-20. Attribution appreciated.
*/
const char *toksplit(const char *src, /* Source of tokens */
                     char tokchar,    /* token delimiting char */
                     char *token,     /* receiver of parsed token */
                     size_t lgh)      /* length token can receive */
                                      /* not including final '\0' */
{
   if (src) {
      while (' ' == *src) src++;      /* skip leading blanks */
      while (*src && (tokchar != *src)) {
         if (lgh) {
            *token++ = *src;
            --lgh;
         }
         src++;
      }
      if (*src && (tokchar == *src)) src++;
   }
   *token = '\0';
   return src;
} /* toksplit */
#ifdef TESTING
#include <stdio.h>
#define ABRsize 6 /* length of acceptable token abbreviations */
int main(void)
{
   char teststring[] = "This is a test, ,, abbrev, more";
   const char *t, *s = teststring;
   int i;
   char token[ABRsize + 1];
   puts(teststring);
   t = s;
   for (i = 0; i < 4; i++) {
      t = toksplit(t, ',', token, ABRsize);
      putchar(i + '1'); putchar(':');
      puts(token);
   }
   puts("\nHow to detect 'no more tokens'");
   t = s; i = 0;
   while (*t) {
      t = toksplit(t, ',', token, 3);
      putchar(i + '1'); putchar(':');
      puts(token);
      i++;
   }
   puts("\nUsing blanks as token delimiters");
   t = s; i = 0;
   while (*t) {
      t = toksplit(t, ' ', token, ABRsize);
      putchar(i + '1'); putchar(':');
      puts(token);
      i++;
   }
   return 0;
} /* main */
#endif
/* ------- end file toksplit.c ----------*/
I have set follow-ups to exclude c.l.c++. Although the above code
is usable there, it is seldom a good idea to mix the two
languages. I have not provided a header file with a C++ linkage
provision.
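For comparison with strtok(), the following sketch applies toksplit to the string from the original question. It carries a condensed copy of the function above (with a plain `src++` in the blank-skipping loop) so that it compiles on its own; `toksplit_all` and its 64-character token limit are hypothetical choices for illustration, not part of the posted code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Condensed copy of toksplit from the post, included only so that this
// sketch is self-contained.
const char* toksplit(const char* src, char tokchar,
                     char* token, std::size_t lgh)
{
    if (src) {
        while (' ' == *src) src++;
        while (*src && (tokchar != *src)) {
            if (lgh) { *token++ = *src; --lgh; }
            src++;
        }
        if (*src && (tokchar == *src)) src++;
    }
    *token = '\0';
    return src;
}

// Hypothetical helper: collect every token with the same `while (*t)`
// loop that the test driver in the post uses.
std::vector<std::string> toksplit_all(const char* text, char tokchar)
{
    char token[64 + 1];                   // assumed limit; longer tokens truncate
    std::vector<std::string> out;
    const char* t = text;
    while (*t) {
        t = toksplit(t, tokchar, token, sizeof token - 1);
        out.push_back(token);
    }
    return out;
}
```

On "this\nis\n\nan\nempty\n\n\nline" this yields eight tokens, three of them empty, where strtok() yields five. One consequence of the `while (*t)` loop is that an empty field after a trailing delimiter is not reported: "a," gives just one token.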
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
.
|
|
|
| User: "" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 01:02:38 PM |
|
|
CBFalconer wrote:
The OP can simply use the following replacement function, which
does not have those objectionable features. The testing code is
longer than the function.
OTOH By using C++ life becomes more productive, less error prone,
less complicated and more elegant:
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
int main()
{
    char tst[] = "this\nis\n\nan\nempty\n\n\nline";
    std::stringstream s;
    s << tst;
    std::vector<std::string> tokens;
    while (! s.eof() ){
        std::string str;
        getline(s,str,'\n');
        tokens.push_back(str);
    }
    for (std::vector<std::string>::const_iterator iter
             = tokens.begin();
         iter !=tokens.end();
         ++iter){
        std::cout << "token: \""<< *iter <<"\"\n";
    }
}
regards
Andy Little
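A common variation on the loop above uses the result of std::getline itself as the loop condition instead of testing eof() before reading. The sketch below assumes nothing beyond the standard library; `split_keep_empty` is a hypothetical name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on `delim`, keeping empty fields.
std::vector<std::string> split_keep_empty(const std::string& text, char delim)
{
    std::istringstream in(text);
    std::vector<std::string> out;
    std::string field;
    // std::getline returns the stream, which converts to false as soon
    // as a read fails, so nothing is pushed after the last real field.
    while (std::getline(in, field, delim)) {
        out.push_back(field);
    }
    return out;
}
```

For the test string both loops give the same eight fields. They differ on input that ends with the delimiter: eof() only becomes true after a read has hit the end, so the eof()-first loop pushes one extra empty string for "a\n", while this version returns a single field.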
.
|
|
|
| User: "jacob navia" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 02:15:26 PM |
|
|
wrote:
CBFalconer wrote:
The OP can simply use the following replacement function, which
does not have those objectionable features. The testing code is
longer than the function.
OTOH By using C++ life becomes more productive, less error prone,
less complicated and more elegant:
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
int main()
{
char tst[] = "this\nis\n\nan\nempty\n\n\nline";
std::stringstream s;
s << tst;
std::vector<std::string> tokens;
while (! s.eof() ){
std::string str;
getline(s,str,'\n');
tokens.push_back(str);
}
for (std::vector<std::string>::const_iterator iter
= tokens.begin();
iter !=tokens.end();
++iter){
std::cout << "token: \""<< *iter <<"\"\n";
}
}
regards
Andy Little
I compiled your program in C++ using the VS 2005 compiler. The
executable size of that stuff was 180 224 bytes.
Then I compiled Chuck's version using his strtok function using the
lcc-win32 compiler (a C compiler, not a C++ one). The size was 14 645 bytes.
Then I eliminated output from both programs. Compiled them without any
optimizations and inserted a loop of 1 million times.
C++ took 1.234 seconds
C took 0.375 seconds
Then I compiled both programs using VS 2005 (64 bits) with full
optimization:
C++ took 0.234 seconds
C took 0.156 seconds
I do not say that these measurements are important for everybody. But
maybe they are important for *some* people.
jacob
.
|
|
|
| User: "Ben C" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 04:48:08 PM |
|
|
On 2006-05-06, jacob navia <jacob@jacob.remcomp.fr> wrote:
[...]
I compiled your program in C++ using the VS 2005 compiler. The
executable size of that stuff was 180 224 bytes.
Then I compiled Chuck's version using his strtok function using the
lcc-win32 compiler (a C compiler, not a C++ one). The size was 14 645 bytes.
Then I eliminated output from both programs. Compiled them without any
optimizations and inserted a loop of 1 million times.
C++ took 1.234 seconds
C took 0.375 seconds
Then I compiled both programs using VS 2005 (64 bits) with full
optimization:
C++ took 0.234 seconds
C took 0.156 seconds
I do not say that these measurements are important for everybody. But
maybe they are important for *some* people.
Interesting. Can you do a timing of VS 2005 with full optimizations on
the C version? I think this would complete the picture.
.
|
|
|
| User: "jacob navia" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
07 May 2006 12:24:21 AM |
|
|
Ben C wrote:
On 2006-05-06, jacob navia <jacob@jacob.remcomp.fr> wrote:
[...]
I compiled your program in C++ using the VS 2005 compiler. The
executable size of that stuff was 180 224 bytes.
Then I compiled Chuck's version using his strtok function using the
lcc-win32 compiler (a C compiler, not a C++ one). The size was 14 645 bytes.
Then I eliminated output from both programs. Compiled them without any
optimizations and inserted a loop of 1 million times.
C++ took 1.234 seconds
C took 0.375 seconds
Then I compiled both programs using VS 2005 (64 bits) with full
optimization:
C++ took 0.234 seconds
C took 0.156 seconds
I do not say that these measurements are important for everybody. But
maybe they are important for *some* people.
Interesting. Can you do a timing of VS 2005 with full optimizations on
the C version? I think this would complete the picture.
I did; it is the measurement above. I compiled both with VS 2005, and
the result was C++ 0.234 sec, C 0.156 sec.
.
|
|
|
|
| User: "maaxiim" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 05:25:39 PM |
|
|
I'm surprised nobody has mentioned strtok_r, the reentrant POSIX
extension to the ISO C strtok.
char *strtok_r(char *restrict s, const char *restrict sep,
               char **restrict lasts);
<quote src="IEEE Std 1003.1, 2004 Edition">
The strtok_r() function considers the null-terminated string s as a
sequence of zero or more text tokens separated by spans of one or more
characters from the separator string sep. The argument lasts points to
a user-provided pointer which points to stored information necessary
for strtok_r() to continue scanning the same string.
In the first call to strtok_r(), s points to a null-terminated string,
sep to a null-terminated string of separator characters, and the value
pointed to by lasts is ignored. The strtok_r() function shall return a
pointer to the first character of the first token, write a null
character into s immediately following the returned token, and update
the pointer to which lasts points.
In subsequent calls, s is a NULL pointer and lasts shall be unchanged
from the previous call so that subsequent calls shall move through the
string s, returning successive tokens until no tokens remain. The
separator string sep may be different from call to call. When no token
remains in s, a NULL pointer shall be returned.
</quote>
If you care to see an example of it in action, refer to:
http://www.opengroup.org/onlinepubs/000095399/functions/strtok.html
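As a small illustration of why the caller-supplied state pointer matters, here is a sketch that runs two scans at once: an outer one over ';' and an inner one over '=' within each item. It assumes a POSIX system where `strtok_r` is declared in <string.h> (it is not part of ISO C or C++, and is not available under that name with every compiler); `parse_pairs` is a hypothetical helper.

```cpp
#include <string.h>   // strtok_r (POSIX, not ISO C)
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parse "key=value;key=value" into pairs.
std::vector<std::pair<std::string, std::string>>
parse_pairs(const std::string& text)
{
    // strtok_r still modifies its input, so work on a writable copy.
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');

    std::vector<std::pair<std::string, std::string>> out;
    char* outer_state = nullptr;          // state of the ';' scan
    for (char* item = strtok_r(buf.data(), ";", &outer_state);
         item != nullptr;
         item = strtok_r(nullptr, ";", &outer_state)) {
        char* inner_state = nullptr;      // independent state of the '=' scan
        char* key = strtok_r(item, "=", &inner_state);
        char* value = strtok_r(nullptr, "=", &inner_state);
        out.emplace_back(key ? key : "", value ? value : "");
    }
    return out;
}
```

With plain strtok() the inner calls would overwrite the position of the outer scan. Note that strtok_r still treats a span of separators as one, so "a=1;;b=2" produces two pairs, not three.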
.
|
|
|
|
|
| User: "CBFalconer" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 07:59:47 PM |
|
|
jacob navia wrote:
.... snip ...
I compiled your program in C++ using the VS 2005 compiler. The
executable size of that stuff was 180 224 bytes.
Then I compiled Chuck's version using his strtok function using the
lcc-win32 compiler (a C compiler, not a C++ one). The size was
14 645 bytes.
I compiled toksplit without the testing code, using gcc -Os, and
the generated object code was 0x5b bytes long. That's less than
100 bytes of object code.
The point is: measure the routine, not the testing program.
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
.
|
|
|
| User: "jacob navia" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
07 May 2006 12:22:07 AM |
|
|
CBFalconer wrote:
jacob navia wrote:
... snip ...
I compiled your program in C++ using the VS 2005 compiler. The
executable size of that stuff was 180 224 bytes.
Then I compiled Chuck's version using his strtok function using the
lcc-win32 compiler (a C compiler, not a C++ one). The size was
14 645 bytes.
I compiled toksplit without the testing code, using gcc -Os, and
the generated object code was 0x5b bytes long. That's less than
100 bytes of object code.
The point is: measure the routine, not the testing program.
OK. With lcc-win32 the size of the toksplit routine is 84 bytes,
or 0x54 if you prefer hex
.
|
|
|
|
|
| User: "" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 03:42:53 PM |
|
|
jacob navia wrote:
I compiled your program in C++ using the VS 2005 compiler. The
executable size of that stuff was 180 224 bytes.
It comes out at around 112 K for me. What were your command line
options?
Then I compiled Chuck's version using his strtok function using the
lcc-win32 compiler (a C compiler, not a C++ one). The size was 14 645 bytes.
Then I eliminated output from both programs. Compiled them without any
optimizations and inserted a loop of 1 million times.
C++ took 1.234 seconds
C took 0.375 seconds
Then I compiled both programs using VS 2005 (64 bits) with full
optimization:
C++ took 0.234 seconds
C took 0.156 seconds
(It would be nice to see the full source code that you were testing
FWIW). C++ version did rather better than I would expect, good
optimiser! ...;-)
I do not say that these measurements are important for everybody. But
maybe they are important for *some* people.
Sure, C++ will handle the C-style code as well if necessary, but the
amount of time you need to spend writing, testing and debugging is a
major factor to some people too.
And of course ... In what real situation are you going to be spending a
long time tokenising string literals?
regards
Andy Little
.
|
|
|
| User: "jacob navia" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 03:48:35 PM |
|
|
Command line for non optimized version:
cl /EHsc toksplit.cpp
lc toksplit.c
Command line for the optimized version:
cl /Ox /EHsc toksplit.cpp
cl /OX toksplit.c
Here is the code
--------------------------------------------------toksplit.h
#ifndef H_toksplit_h
# define H_toksplit_h
# ifdef __cplusplus
extern "C" {
# endif
#include <stddef.h>
/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.
The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.
Returns: a pointer past the terminating tokchar.
This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.
released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
*/
const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh); /* length token can receive */
/* not including final '\0' */
# ifdef __cplusplus
}
# endif
#endif
--------------------------------------------end of toksplit.h
Now toksplit.c
/* ------- file toksplit.c ----------*/
#include "toksplit.h"
/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.
The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.
Returns: a pointer past the terminating tokchar.
This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.
A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.
released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
*/
const char *toksplit(const char *src, /* Source of tokens */
                     char tokchar,    /* token delimiting char */
                     char *token,     /* receiver of parsed token */
                     size_t lgh)      /* length token can receive */
                                      /* not including final '\0' */
{
   if (src) {
      while (' ' == *src) src++;      /* skip leading blanks */
      while (*src && (tokchar != *src)) {
         if (lgh) {
            *token++ = *src;
            --lgh;
         }
         src++;
      }
      if (*src && (tokchar == *src)) src++;
   }
   *token = '\0';
   return src;
} /* toksplit */
#include <stdio.h>
#define ABRsize 64 /* length of acceptable token abbreviations */
int main(void)
{
   char teststring[] = "this\nis\n\nan\nempty\n\n\nline";
   const char *t, *s = teststring;
   int i;
   char token[ABRsize + 1];
   int count;
   count=0;
   do {
      t = s; i = 0;
      while (*t) {
         t = toksplit(t, '\n', token, 64);
         //putchar(i + '1'); putchar(':');
         //puts(token);
         i++;
      }
      count++;
   } while (count < 1000000);
   return 0;
} /* main */
--------------------------------------------------------------toksplit.c
Now the C++ version:
--------------------------------------------------------------toksplit.cpp
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
int main()
{
    char tst[] = "this\nis\n\nan\nempty\n\n\nline";
    std::stringstream s;
    s << tst;
    std::vector<std::string> tokens;
    int count=0;
    do {
        s << tst;
        while (! s.eof() ){
            std::string str;
            getline(s,str,'\n');
            tokens.push_back(str);
        }
        for (std::vector<std::string>::const_iterator iter
                 = tokens.begin();
             iter !=tokens.end();
             ++iter){
            //std::cout << "token: \""<< *iter <<"\"\n";
        }
        count++;
    } while (count < 1000000);
}
--------------------------------------------------------------end of
toksplit.cpp
.
|
|
|
| User: "Ian Collins" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 05:42:25 PM |
|
|
jacob navia wrote:
Now the C++ version:
--------------------------------------------------------------toksplit.cpp
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
int main()
{
char tst[] = "this\nis\n\nan\nempty\n\n\nline";
std::stringstream s;
s << tst;
std::vector<std::string> tokens;
int count=0;
do {
s << tst;
while (! s.eof() ){
std::string str;
getline(s,str,'\n');
tokens.push_back(str);
}
for (std::vector<std::string>::const_iterator iter
= tokens.begin();
iter !=tokens.end();
++iter){
//std::cout << "token: \""<< *iter <<"\"\n";
}
count++;
} while (count < 1000000);
}
--------------------------------------------------------------end of
toksplit.cpp
If you are going to eliminate output for comparison, you should comment
out the entire last for loop as the C version outputs inline.
Also, to make things more equal, remove the vector, as this is only used
to store tokens for output.
--
Ian Collins.
.
|
|
|
| User: "" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 08:05:47 PM |
|
|
Ian Collins wrote:
If you are going to eliminate output for comparison, you should comment
out the entire last for loop as the C version outputs inline.
Also, to make things more equal, remove the vector, as this is only used
to store tokens for output.
FWIW, below is my version of the comparison. Moving the construction of
the stringstream into the loop really kills the performance of the
stringstream version. However, this is IMO a more realistic *simple*
usage. I also modified the other code into C++ style, but that's by the
way. With this approach the C code is an order of magnitude faster (I
had to decrease the number of loops to avoid waiting on the
stringstream code), but it's not really a fair comparison. The killer of
the C version for me is that you can't have arbitrary-length tokens: you
are limited to whatever the value of ABRsize is. If the C coders want
to write a version that can handle arbitrary-length C-style strings,
then it would be a fairer comparison IMO (though my previous comments
re ease of coding, testing etc. remain). BTW, I used the Boost timer for
timing. If you haven't got the Boost distro you will just have to modify
those parts. I'm too lazy to do that...
regards
Andy Little
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>
#include <iostream>
#include <boost/timer.hpp>
int const ABRsize = 64;
int const NLOOPS = 100000;
const char *
toksplit(
    const char *src,
    char tokchar,
    char *token,
    std::size_t lgh
);
int main()
{
    char tst[] = "this\nis\n\nan\nempty\n\n\nline";
    std::cout << "Timing stringstream version: ";
    boost::timer t0;
    for( int count = 0; count < NLOOPS; ++count) {
        std::stringstream ss;
        ss << tst;
        while (! ss.eof() ){
            std::string str;
            getline(ss,str,'\n');
        }
    }
    std::cout << t0.elapsed() << "s\n";
    std::cout << "Timing toksplit version: ";
    boost::timer t1;
    for( int count =0;count < NLOOPS;++count){
        char token[ABRsize + 1];
        const char *t = tst;
        while (*t) {
            t = toksplit(t, '\n', token, ABRsize);
        }
    }
    std::cout << t1.elapsed() << "s\n";
}
const char *toksplit(
    const char *src,   /* Source of tokens */
    char tokchar,      /* token delimiting char */
    char *token,       /* receiver of parsed token */
    std::size_t lgh)   /* length token can receive */
                       /* not including final '\0' */
{
    if (src) {
        while (' ' == *src) src++;   /* skip leading blanks */
        while (*src && (tokchar != *src)) {
            if (lgh) {
                *token++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src && (tokchar == *src)) src++;
    }
    *token = '\0';
    return src;
} /* toksplit */
.
|
|
|
|
|
| User: "" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 07:47:17 PM |
|
|
jacob navia wrote:
Command line for non optimized version:
cl /EHsc toksplit.cpp
(Assuming this is my original as above)
Using these switches comes out at 120 kb on my system
lc toksplit.c
Command line for the optimized version:
cl /Ox /EHsc toksplit.cpp
Using these switches, comes out at 112 kb on my system
regards
Andy Little
.
|
|
|
| User: "jacob navia" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
07 May 2006 03:33:33 PM |
|
|
wrote:
jacob navia wrote:
Command line for non optimized version:
cl /EHsc toksplit.cpp
(Assuming this is my original as above)
Using these switches comes out at 120 kb on my system
Yes, I am using the 64 bit compiler under windows server 2003 64 bits.
The native code is 64 bits too.
lc toksplit.c
Command line for the optimized version:
cl /Ox /EHsc toksplit.cpp
Using these switches, comes out at 112 kb on my system
Yes, in 32 bits it's smaller. Still, nothing like 15K...
regards
Andy Little
.
|
|
|
| User: "" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
08 May 2006 09:00:39 PM |
|
|
jacob navia wrote:
andy@servocomm.freeserve.co.uk wrote:
jacob navia wrote:
Command line for non optimized version:
cl /EHsc toksplit.cpp
(Assuming this is my original as above)
Using these switches comes out at 120 kb on my system
Yes, I am using the 64 bit compiler under windows server 2003 64 bits.
The native code is 64 bits too.
lc toksplit.c
Command line for the optimized version:
cl /Ox /EHsc toksplit.cpp
Using these switches, comes out at 112 kb on my system
Yes, in 32 bits it's smaller. Still, nothing like 15K...
It's a shame that compiler can't handle C++ code, else it might be more
interesting to me.
I'd really like to see what the C version for arbitrary length strings
would look like though.
regards
Andy Little
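One way to sidestep the fixed token buffer, sketched here as an assumption and not as anyone's posted solution, is to not copy at all: report each field as a start pointer plus a length found with strcspn(). `Field`, `next_field` and `all_fields` are hypothetical names. The scanning function sticks to C library calls, and only the collecting helper uses C++ containers.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical type: a field described by where it starts and its length.
struct Field {
    const char* start;
    std::size_t length;
};

// Hypothetical function: describe the field beginning at *cursor, then
// advance *cursor past the delimiter that ended it. *cursor becomes null
// after the last field, and the next call returns false.
bool next_field(const char** cursor, const char* delims, Field* out)
{
    if (*cursor == nullptr) return false;
    const char* p = *cursor;
    std::size_t n = std::strcspn(p, delims);   // up to a delimiter or '\0'
    out->start = p;
    out->length = n;
    *cursor = (p[n] != '\0') ? p + n + 1 : nullptr;
    return true;
}

// Convenience wrapper: copy every field into a std::string.
std::vector<std::string> all_fields(const char* text, const char* delims)
{
    std::vector<std::string> out;
    Field f;
    for (const char* cur = text; next_field(&cur, delims, &f); ) {
        out.emplace_back(f.start, f.length);
    }
    return out;
}
```

The source string is never modified, tokens can be any length, adjacent delimiters produce empty fields, and the scan state lives in the caller's `cursor`. The price is that a field is not null-terminated, so the caller has to use the length.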
.
|
|
|
|
|
|
|
|
|
| User: "Christopher Benson-Manica" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 06:09:45 PM |
|
|
In comp.lang.c wrote:
OTOH By using C++ life becomes more productive, less error prone,
less complicated and more elegant:
Not always... (digression warning)
std::cout << "token: \""<< *iter <<"\"\n";
IMHO this is harder for the programmer to read than
printf( "token: \"%s\"\n", str );
To a certain extent this is a question of religion, but the difference
between the prevailing styles becomes more pronounced with heavily
formatted output:
printf( "%6s %2.2f %-18s:%u\n", val1, val2, val3, val4 );
Accomplishing the same thing with std::cout would be messy.
--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
.
|
|
|
| User: "" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 08:14:43 PM |
|
|
Christopher Benson-Manica wrote:
In comp.lang.c wrote:
OTOH By using C++ life becomes more productive, less error prone,
less complicated and more elegant:
Not always... (digression warning)
std::cout << "token: \""<< *iter <<"\"\n";
IMHO this is harder for the programmer to read than
printf( "token: \"%s\"\n", str );
In this case I think it depends what you are used to!
To a certain extent this is a question of religion, but the difference
between the prevailing styles becomes more pronounced with heavily
formatted output:
printf( "%6s %2.2f %-18s:%u\n", val1, val2, val3, val4 );
Accomplishing the same thing with std::cout would be messy.
FWIW I think it might look like this:
std::cout
<< std::setw(6) << val1
<< ' ' << std::fixed << std::setw(2)
<< std::setprecision(2) << val2
<< ' ' << std::left << std::setw(18) << val3
<< ":" << val4 << '\n';
regards
Andy Little
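The two spellings can be checked against each other by capturing both into strings. In this sketch the value types (two C strings, a double and an unsigned) are assumptions, since the thread never says what val1 to val4 are; `format_printf` and `format_stream` are hypothetical names, and <iomanip> is needed for the manipulators.

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: the printf-style line, captured with snprintf.
std::string format_printf(const char* val1, double val2,
                          const char* val3, unsigned val4)
{
    char buf[128];   // assumed large enough for short test values
    std::snprintf(buf, sizeof buf, "%6s %2.2f %-18s:%u\n",
                  val1, val2, val3, val4);
    return buf;
}

// Hypothetical helper: the iostream-style line from the post above.
std::string format_stream(const char* val1, double val2,
                          const char* val3, unsigned val4)
{
    std::ostringstream out;
    out << std::setw(6) << val1                  // %6s
        << ' ' << std::fixed << std::setw(2)
        << std::setprecision(2) << val2          // %2.2f
        << ' ' << std::left << std::setw(18) << val3   // %-18s
        << ":" << val4 << '\n';                  // %u
    return out.str();
}
```

For ordinary values the two functions return the same text. One difference worth keeping in mind: setw applies only to the next item, but std::fixed, std::left and the precision stay set on the stream, so writing this directly to std::cout changes how later output is formatted.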
.
|
|
|
| User: "Ian Collins" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 08:32:32 PM |
|
|
wrote:
Christopher Benson-Manica wrote:
To a certain extent this is a question of religion, but the difference
between the prevailing styles becomes more pronounced with heavily
formatted output:
printf( "%6s %2.2f %-18s:%u\n", val1, val2, val3, val4 );
Accomplishing the same thing with std::cout would be messy.
FWIW I think it might look like this:
std::cout
<< std::setw(6) << val1
<< ' ' << std::fixed << std::setw(2)
<< std::setprecision(2) << val2
<< ' ' << std::left << std::setw(18) << val3
<< ":" << val4 << '\n';
Which I'm sure you will admit, is a bit of an abomination!
Thank goodness C++ retains the C standard library for cases like this.
--
Ian Collins.
.
|
|
|
| User: "Phlip" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 09:11:33 PM |
|
|
Ian Collins wrote:
std::cout
<< std::setw(6) << val1
<< ' ' << std::fixed << std::setw(2)
<< std::setprecision(2) << val2
<< ' ' << std::left << std::setw(18) << val3
<< ":" << val4 << '\n';
Which I'm sure you will admit, is a bit of an abomination!
Thank goodness C++ retains the C standard library for cases like this.
Thank goodness C++ permits you to write custom IO manipulators, to bottle
all that up into something legible.
Try extending printf to accept your own % tags some time...
--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
.
|
|
|
| User: "Malcolm" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
07 May 2006 02:03:33 PM |
|
|
"Phlip" <phlipcpp@yahoo.com> wrote
Ian Collins wrote:
std::cout
<< std::setw(6) << val1
<< ' ' << std::fixed << std::setw(2)
<< std::setprecision(2) << val2
<< ' ' << std::left << std::setw(18) << val3
<< ":" << val4 << '\n';
Which I'm sure you will admit, is a bit of an abomination!
Thank goodness C++ retains the C standard library for cases like this.
Thank goodness C++ permits you to write custom IO manipulators, to bottle
all that up into something legible.
Try extending printf to accept your own % tags some time...
It can't be done. Probably a good thing, because functions should always
behave the same way. If a standards committee decided, however, it would be
trivial to add an addprintftag() function.
/*
add a custom field to printf
Params: fieldcode - the letter we are using for the field (existing codes
can be overwritten).
format - function to perform formatting. Returns a pointer to static data.
field - the field the user entered (e.g. " %[myfancyspecifier]d")
obj - pointer user passed to printf.
*/
void addprintftag(char fieldcode,
                  const char *(*format)(const char *field, const void *obj));
The slight nuisance is that the user must always pass extended arguments by
address, because the printf() variable argument code needs to know what
objects to get off the stack.
What you can do, of course, is write your own vsprintf_extendable()
function. Then you have to pass the results to an output function.
--
Visit my webpage www.personal.leeds.ac.uk/bgy1mm
Programming goodies.
.
|
|
|
|
| User: "Ian Collins" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 09:16:42 PM |
|
|
Phlip wrote:
Ian Collins wrote:
std::cout
<< std::setw(6) << val1
<< ' ' << std::fixed << std::setw(2)
<< std::setprecision(2) << val2
<< ' ' << std::left << std::setw(18) << val3
<< ":" << val4 << '\n';
Which I'm sure you will admit, is a bit of an abomination!
Thank goodness C++ retains the C standard library for cases like this.
Thank goodness C++ permits you to write custom IO manipulators, to bottle
all that up into something legible.
Thank goodness the C standard library saves you the trouble of writing
custom IO manipulators, to bottle all that up into something legible.
--
Ian Collins.
.
|
|
|
| User: "Phlip" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 09:25:17 PM |
|
|
Ian Collins wrote:
Thank goodness the C standard library saves you the trouble of writing
custom IO manipulators, to bottle all that up into something legible.
%*.*e
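In case the conversion above looks cryptic: each * in %*.*e takes an int argument, the first for the field width and the second for the precision, so both can be chosen at run time. A minimal sketch, assuming a C++11 compiler; format_scientific is a hypothetical helper name.

```cpp
#include <cstdio>
#include <string>

// Formats value in scientific notation, with the field width and the
// precision supplied as arguments through the two '*' placeholders.
std::string format_scientific(double value, int width, int precision)
{
    // First pass: ask snprintf how many characters are needed.
    const int needed = std::snprintf(nullptr, 0, "%*.*e", width, precision, value);
    if (needed < 0) {
        return std::string();  // encoding error reported by snprintf
    }
    // Second pass: format into a buffer with room for the terminator.
    std::string result(static_cast<std::string::size_type>(needed) + 1, '\0');
    std::snprintf(&result[0], result.size(), "%*.*e", width, precision, value);
    result.resize(static_cast<std::string::size_type>(needed));
    return result;
}
```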
--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
.
|
|
|
|
| User: "Richard Herring" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
09 May 2006 09:32:02 AM |
|
|
In message <4c53kaF137jiaU5@individual.net>, Ian Collins
<ian-news@hotmail.com> writes
Phlip wrote:
Ian Collins wrote:
std::cout
<< std::setw(6) << val1
<< ' ' << std::fixed << std::setw(2)
<< std::setprecision(2) << val2
<< ' ' << std::left << std::setw(18) << val3
<< ":" << val4 << '\n';
Which I'm sure you will admit, is a bit of an abomination!
Thank goodness C++ retains the C standard library for cases like this.
Thank goodness C++ permits you to write custom IO manipulators, to bottle
all that up into something legible.
Thank goodness the C standard library saves you the trouble of writing
custom IO manipulators, to bottle all that up into something legible.
If val1, val2, val3, val4 are so intimately related, surely they should
have been encapsulated into an appropriate class or struct in the first
place.
Then the above reduces to
MyClass x;
std::cout << x;
Isn't that abominable?
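A rough sketch of what that encapsulation might look like. The member types are assumptions (the thread never says what val1 to val4 are), and formatting into a scratch stream is just one way to keep the sticky flags from leaking into std::cout.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record; the member types are assumed for illustration.
struct MyClass {
    int val1;
    double val2;
    std::string val3;
    int val4;
};

// Formats into a scratch stream so that sticky settings (std::fixed,
// std::left, the precision) do not change the caller's stream.
std::ostream& operator<<(std::ostream& os, const MyClass& x)
{
    std::ostringstream line;
    line << std::setw(6) << x.val1
         << ' ' << std::fixed << std::setw(2)
         << std::setprecision(2) << x.val2
         << ' ' << std::left << std::setw(18) << x.val3
         << ":" << x.val4 << '\n';
    return os << line.str();
}
```

The manipulator chain is the one quoted above, moved behind the operator so that call sites only see std::cout << x.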
--
Richard Herring
.
|
|
|
| User: "Phlip" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 09:17:31 AM |
|
|
Geometer wrote:
please can somebody tell me, what the standard behavior of strtok shall
be, if it encounters two or more consecutive delimiters like in
(checks omitted)
char tst[] = "this\nis\n\nan\nempty\n\n\nline";
^^^^ ^^^^^^
char *tok = strtok(tst, "\n");
tok = strtok(NULL, "\n");
and so on..
will the groups of '\n' marked above be consumed one by one or the whole
group together?
Yes.
But why didn't you just write a test case and see?
Going forward, don't use strtok(). Google for a replacement, possibly
including a Regex system. Then you can control such details.
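For anyone who wants to run that test case: strtok() starts each call by skipping every character that is in the delimiter set, so a run of delimiters is consumed as a group and no empty tokens come back. A minimal sketch follows; tokenize_with_strtok and split_keep_empty are hypothetical helper names, and the second is only one possible replacement that reports empty fields.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Tokenize with std::strtok. The text is taken by value because strtok
// writes null terminators into the buffer it scans.
std::vector<std::string> tokenize_with_strtok(std::string text, const char* delims)
{
    std::vector<std::string> tokens;
    // &text[0] is a writable, null-terminated buffer (since C++11).
    for (char* tok = std::strtok(&text[0], delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.push_back(tok);
    }
    return tokens;
}

// Alternative that treats every delimiter as a field boundary, so
// adjacent delimiters produce empty fields instead of being merged.
std::vector<std::string> split_keep_empty(const std::string& text,
                                          const std::string& delims)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = text.find_first_of(delims, start);
        if (end == std::string::npos) {
            fields.push_back(text.substr(start));  // last field
            return fields;
        }
        fields.push_back(text.substr(start, end - start));
        start = end + 1;
    }
}
```

On the string from the original question, the strtok version yields five tokens, while split_keep_empty yields eight fields, three of them empty.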
--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
.
|
|
|
| User: "Geometer" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 09:25:08 AM |
|
|
--
Geometer
Dipl.Ing. Erwin Lebloch
Hauptplatz 39
2130 Mistelbach - NÖ
Tel.: 02572/4300
www.lebloch.at
geometer@lebloch.at
"Phlip" <phlipcpp@yahoo.com> wrote in message
news:%d27g.27531$NS6.15820@newssvr30.news.prodigy.com...
Geometer wrote:
please can somebody tell me, what the standard behavior of strtok shall
be, if it encounters two or more consecutive delimiters like in
(checks omitted)
char tst[] = "this\nis\n\nan\nempty\n\n\nline";
^^^^ ^^^^^^
char *tok = strtok(tst, "\n");
tok = strtok(NULL, "\n");
and so on..
will the groups of '\n' marked above be consumed one by one or the whole
group together?
Yes.
But why didn't you just write a test case and see?
I did :). I just wanted to know if this is the behavior required by the
standard and whether there is a difference between C and C++.
Thanks for your response.
Robert
.
|
|
|
| User: "Phlip" |
|
| Title: Re: strtok behavior with multiple consecutive delimiters |
06 May 2006 07:31:50 PM |
|
|
Geometer wrote:
I did :). I just wanted to know if this is the behavior required by the
standard and whether there is a difference between C and C++.
Can we add to the FAQ "Please don't ask about strtok(), because everyone
here is ready to complain about it in endless ways"?
--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
.
|
|
|