Author Topic: Latest String Class  (Read 9981 times)

Offline Frederick J. Harris

  • Hero Member
  • *****
  • Posts: 914
  • User-Rate: +16/-0
    • Frederick J. Harris
Latest String Class
« on: February 15, 2013, 11:08:54 PM »
Here would be Strings.h

Code: [Select]
//Strings.h
#if !defined(STRINGS_H)
#define STRINGS_H
#define EXPANSION_FACTOR      2
#define MINIMUM_ALLOCATION   16

class String
{
 public:
 friend String operator+(TCHAR*, String&);
 String();                                    //Uninitialized Constructor
 String(const TCHAR);                         //Constructor Initializes With A TCHAR.
 String(const TCHAR*);                        //Constructor Initializes String With TCHAR*
 String(const String&);                       //Constructor Initializes String With Another String (Copy Constructor)
 String(const int, bool);                     //Constructor Creates String With User Specified Capacity and optionally nulls out
 String(const int, const TCHAR);              //Constructor initializes String with int # of TCHARs
 String(int);                                 //Constructor initializes String with int converted to String
 String(unsigned int);                        //Constructor initializes String with unsigned int converted to String
 String(double);                              //Constructor initializes String with double converted to String
 String& operator=(const TCHAR);              //Assign A TCHAR To A String
 String& operator=(const TCHAR*);             //Assign A Null Terminated Character Array To A String
 String& operator=(const String&);            //Assigns Another String To this One
 String& operator=(int iNum);                 //Assigns an int to a String
 String& operator=(unsigned int iNum);        //Assigns an unsigned int to a String
 String& operator=(double dblNum);            //Assign a double to a String
 String operator+(const TCHAR);               //For adding TCHAR to String
 String operator+(const TCHAR*);              //Adds a TCHAR* to this
 String operator+(String&);                   //Adds another String to this
 String& operator+=(const TCHAR ch);          //Add TCHAR to this
 String& operator+=(const String&);           //Adds a String to this and assigns it to left of equal sign
 String& operator+=(const TCHAR*);            //Adds a TCHAR*to this and assigns it to left of equal sign
 bool operator==(String&);                    //Compares Strings For Case Sensitive Equality
 bool operator==(const TCHAR*);               //Compares String Against TCHAR* For Case Sensitive Equality
 String Allocate(int);                        //Allocates a String with a specified buffer size
 String& Make(const TCHAR ch, int iCount);    //Returns reference to this with iCount ch TCHARs in it
 String Left(int);                            //Returns String of iNum Left Most TCHARs of this
 String Right(int);                           //Returns String of iNum Right Most TCHARs of this
 String Mid(int, int);                        //Returns String consisting of number of TCHARs from some one based offset
 String Replace(TCHAR*, TCHAR*);              //Returns String with 1st TCHAR* parameter replaced with 2nd TCHAR* parameter
 String Remove(TCHAR*);                       //Returns A String With All The TCHARs In A TCHAR* Removed (Individual TCHAR removal)
 String Remove(const TCHAR*, bool);           //Returns a String with 1st parameter removed.  2nd is bool for case sensitivity.
 int InStr(const TCHAR*, bool, bool);         //Returns one based offset of a particular TCHAR pStr in a String, starting from left or right (left=true)
 int InStr(const String&, bool, bool);        //Returns one based offset of where a particular String is in another String, starting from left or right (left=true)
 int ParseCount(const TCHAR);                 //Returns count of Strings delimited by a TCHAR passed as a parameter
 void Parse(String*, TCHAR);                  //Fills array of Strings in first parameter as delimited by 2nd TCHAR delimiter
 void SetTCHAR(int, TCHAR);                   //Sets TCHAR at zero based offset in this
 void LTrim();                                //Removes leading spaces/tabs from this
 void RTrim();                                //Removes trailing spaces/tabs/CR/LF from this
 void Trim();                                 //Removes both leading and trailing whitespace from this
 int iVal();                                  //Returns integral value of String
 int Len();                                   //Returns Length Of String Controlled By this
 int Capacity();                              //Returns Maximum Permissible Character Count (One Less Than Allocation).
 TCHAR* lpStr();                              //Returns TCHAR* To String
 void Print(TCHAR*, bool);                    //Outputs String To Console With Or Without CrLf.
 ~String();                                   //String Destructor

 private:
 TCHAR* lpBuffer;
 int   iLen;
 int   iCapacity;
};

String operator+(TCHAR* lhs, String& rhs);
#endif  //#if !defined(STRINGS_H)

Note that we're caching a pointer to the string buffer, the length of the string, and the capacity.  The capacity is the number of TCHARs the buffer can hold, which is one less than the allocation so there is always room for the terminating null.  Each operation compares the length it needs against the capacity to decide whether the buffer has to be expanded first.
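
To make the capacity bookkeeping concrete, here is a minimal standalone sketch of the sizing rule.  The helper names grownAllocation and needsGrowth are hypothetical and are not members of the class; the sketch only assumes the same two constants as Strings.h and the rounding expression used in Strings.cpp.

```cpp
#include <cassert>

#define EXPANSION_FACTOR      2
#define MINIMUM_ALLOCATION   16

// Hypothetical helper: buffer size (in characters) to allocate when growing to
// hold iLen characters.  Mirrors the (n*EXPANSION_FACTOR/16+1)*16 expression in
// Strings.cpp, so the result is always a multiple of 16.
int grownAllocation(int iLen)
{
 return (iLen*EXPANSION_FACTOR/MINIMUM_ALLOCATION+1)*MINIMUM_ALLOCATION;
}

// Hypothetical helper: true if iNeeded characters will not fit in iCapacity
bool needsGrowth(int iNeeded, int iCapacity)
{
 return iNeeded>iCapacity;
}

void capacityDemo()
{
 int iCapacity=MINIMUM_ALLOCATION-1;        //15, as in the default constructor
 assert(!needsGrowth(15,iCapacity));        //15 characters plus the null still fit
 assert(needsGrowth(16,iCapacity));         //16 do not
 assert(grownAllocation(16)==48);           //(16*2/16+1)*16
}
```

Doubling on growth means a string built up one character at a time only reallocates occasionally rather than on every append.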
« Last Edit: February 18, 2013, 01:07:36 AM by Frederick J. Harris »

Offline Frederick J. Harris

Re: Latest String Class
« Reply #1 on: February 15, 2013, 11:10:18 PM »
Now I'll try the Strings.cpp file ...

Code: [Select]
//Strings.cpp
#define UNICODE
#define _UNICODE
#include  <stdlib.h>
#include  <cstdio>
#include  <tchar.h>
#include  <math.h>
#include  <string.h>
#include  "Strings.h"


String operator+(TCHAR* lhs, String& rhs)         //global function
{
 String sr=lhs;
 sr=sr+rhs;

 return sr;
}


String::String()
{
 lpBuffer=new TCHAR[MINIMUM_ALLOCATION];
 lpBuffer[0]=_T('\0');
 this->iCapacity=MINIMUM_ALLOCATION-1;
 this->iLen=0;
}


String::String(const TCHAR ch)  //Constructor: Initializes with TCHAR
{
 this->iLen=1;
 int iNewSize=MINIMUM_ALLOCATION;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 this->lpBuffer[0]=ch, this->lpBuffer[1]=_T('\0');
}


String::String(const TCHAR* pStr)  //Constructor: Initializes with TCHAR*
{
 this->iLen=_tcslen(pStr);
 int iNewSize=(this->iLen/16+1)*16;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 _tcscpy(lpBuffer,pStr);
}


String::String(const String& s)  //Constructor Initializes With Another String, i.e., Copy Constructor
{
 int iNewSize=(s.iLen/16+1)*16;
 this->iLen=s.iLen;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 _tcscpy(this->lpBuffer,s.lpBuffer);
}


String::String(const int iSize, bool blnFillNulls)  //Constructor Creates String With Custom Sized
{                                                   //Buffer (rounded up to paragraph boundary)
 int iNewSize=(iSize/16+1)*16;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 this->iLen=0;
 this->lpBuffer[0]=_T('\0');
 if(blnFillNulls)
 {
    for(int i=0; i<iNewSize; i++)        //Null the whole allocation, including the last TCHAR
        this->lpBuffer[i]=0;
 }
}


String::String(int iCount, const TCHAR ch)
{
 int iNewSize=(iCount/16+1)*16;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 for(int i=0; i<iCount; i++)
     this->lpBuffer[i]=ch;
 this->lpBuffer[iCount]=_T('\0');
 this->iLen=iCount;
}


String::String(int iNum)
{
 this->lpBuffer=new TCHAR[16];
 this->iCapacity=15;
 this->iLen=_stprintf(this->lpBuffer,_T("%d"),iNum);
}


String::String(unsigned int iNum)
{
 this->lpBuffer=new TCHAR[16];
 this->iCapacity=15;
 this->iLen=_stprintf(this->lpBuffer,_T("%u"),iNum);
}


String::String(double dblNum)
{
 this->lpBuffer=new TCHAR[32];
 this->iCapacity=31;
 this->iLen=_stprintf(this->lpBuffer,_T("%10.14f"),dblNum);
}


String& String::operator=(double dblNum)
{
 if(this->iCapacity<31)                  //A 32 TCHAR buffer (capacity 31) is already big enough
 {
    delete [] this->lpBuffer;
    lpBuffer=new TCHAR[32];
    this->iCapacity=31;
 }
 this->iLen=_stprintf(this->lpBuffer,_T("%10.14f"),dblNum);

 return *this;
}


void String::SetTCHAR(int iOffset, TCHAR ch)   //zero based!
{
 if(iOffset<this->iCapacity)
 {
    this->lpBuffer[iOffset]=ch;
    if(ch==_T('\0'))
    {
       if(iOffset<this->iLen || this->iLen==0)
          this->iLen=iOffset;
    }
 }
}


String& String::operator=(const TCHAR ch)
{
 this->lpBuffer[0]=ch, this->lpBuffer[1]=_T('\0');
 this->iLen=1;
 return *this;
}


String& String::operator=(const TCHAR* pStr)
{
 int iNewLen=_tcslen(pStr);
 if(iNewLen>this->iCapacity)
 {
    delete [] this->lpBuffer;
    int iNewSize=(iNewLen*EXPANSION_FACTOR/16+1)*16;
    this->lpBuffer=new TCHAR[iNewSize];
    this->iCapacity=iNewSize-1;
 }
 _tcscpy(this->lpBuffer,pStr);
 this->iLen=iNewLen;

 return *this;
}


String& String::operator=(const String& strAnother)
{
 if(this==&strAnother)
    return *this;
 if(strAnother.iLen>this->iCapacity)
 {
    delete [] this->lpBuffer;
    int iNewSize=(strAnother.iLen*EXPANSION_FACTOR/16+1)*16;
    this->lpBuffer=new TCHAR[iNewSize];
    this->iCapacity=iNewSize-1;
 }
 _tcscpy(this->lpBuffer,strAnother.lpBuffer);
 this->iLen=strAnother.iLen;

 return *this;
}


String& String::operator=(int iNum)
{
 if(this->iCapacity>=15)
    this->iLen=_stprintf(this->lpBuffer,_T("%d"),iNum);
 else
 {
    delete [] this->lpBuffer;
    this->lpBuffer=new TCHAR[16];
    this->iCapacity=15;
    this->iLen=_stprintf(this->lpBuffer,_T("%d"),iNum);
 }

 return *this;
}


String& String::operator=(unsigned int iNum)
{
  if(this->iCapacity>=15)
    this->iLen=_stprintf(this->lpBuffer,_T("%u"),iNum);   //%u, not %d, for unsigned
 else
 {
    delete [] this->lpBuffer;
    this->lpBuffer=new TCHAR[16];
    this->iCapacity=15;
    this->iLen=_stprintf(this->lpBuffer,_T("%u"),iNum);
 }

 return *this;
}


String String::operator+(const TCHAR ch)
{
 int iNewLen=this->iLen+1;

 String s(iNewLen,false);
 _tcscpy(s.lpBuffer,this->lpBuffer);
 s.lpBuffer[iNewLen-1]=ch;
 s.lpBuffer[iNewLen]=_T('\0');
 s.iLen=iNewLen;

 return s;
}


String String::operator+(const TCHAR* pStr)
{
 int iNewLen=_tcslen(pStr)+this->iLen;
 String s(iNewLen,false);
 _tcscpy(s.lpBuffer,this->lpBuffer);
 _tcscat(s.lpBuffer,pStr);
 s.iLen=iNewLen;

 return s;
}


String String::operator+(String& strRef)
{
 int iNewLen=strRef.iLen+this->iLen;
 String s(iNewLen,false);
 _tcscpy(s.lpBuffer,this->lpBuffer);
 _tcscat(s.lpBuffer,strRef.lpBuffer);
 s.iLen=iNewLen;

 return s;
}


String& String::operator+=(const TCHAR ch)
{
 int iTot=this->iLen+1;
 if(iTot>this->iCapacity)
 {
    int iNewSize=(iTot*EXPANSION_FACTOR/16+1)*16;
    TCHAR* pNew=new TCHAR[iNewSize];
    _tcscpy(pNew,this->lpBuffer);
    delete [] this->lpBuffer;
    this->lpBuffer=pNew;
    this->lpBuffer[iTot-1]=ch;
    this->lpBuffer[iTot]=_T('\0');
    this->iCapacity=iNewSize-1;
    this->iLen=iTot;
 }
 else
 {
    this->lpBuffer[iTot-1]=ch;
    this->lpBuffer[iTot]=_T('\0');
    this->iLen=iTot;
 }
 return *this;
}


String& String::operator+=(const TCHAR* pStr)
{
 int iStrlen=_tcslen(pStr);
 int iTot=iStrlen+this->iLen;
 if(iTot>this->iCapacity)
 {
    int iNewSize=(iTot*EXPANSION_FACTOR/16+1)*16;
    TCHAR* pNew=new TCHAR[iNewSize];
    _tcscpy(pNew,this->lpBuffer);
    delete [] this->lpBuffer;
    this->lpBuffer=pNew;
    _tcscat(pNew,pStr);
    this->iCapacity=iNewSize-1;
    this->iLen=iTot;
 }
 else
 {
    _tcscat(this->lpBuffer,pStr);
    this->iLen=iTot;
 }
 return *this;
}


String& String::operator+=(const String& strRef)
{
 int iTot=strRef.iLen+this->iLen;
 if(iTot>this->iCapacity)
 {
    int iNewSize=(iTot*EXPANSION_FACTOR/16+1)*16;
    TCHAR* pNew=new TCHAR[iNewSize];
    _tcscpy(pNew,this->lpBuffer);
    delete [] this->lpBuffer;
    this->lpBuffer=pNew;
    _tcscat(pNew,strRef.lpBuffer);
    this->iCapacity=iNewSize-1;
    this->iLen=iTot;
 }
 else
 {
    _tcscat(this->lpBuffer,strRef.lpBuffer);
    this->iLen=iTot;
 }
 return *this;
}


bool String::operator==(String& strRef)
{
 if(_tcscmp(this->lpStr(),strRef.lpStr())==0)
    return true;
 else
    return false;
}


bool String::operator==(const TCHAR* pStr)
{
 if(_tcscmp(this->lpStr(),pStr)==0)
    return true;
 else
    return false;
}


String String::Allocate(int iCount)
{
 if(iCount>this->iCapacity)
 {
    delete [] lpBuffer;
    int iNewSize=(iCount*EXPANSION_FACTOR/16+1)*16;
    this->lpBuffer=new TCHAR[iNewSize];
    this->iCapacity=iNewSize-1;
 }
 this->lpBuffer[0]=_T('\0');
 this->iLen=0;

 return *this;
}


String& String::Make(const TCHAR ch, int iCount)    //Creates (Makes) a String with iCount TCHARs
{
 if(iCount>this->iCapacity)
 {
    delete [] lpBuffer;
    int iNewSize=(iCount*EXPANSION_FACTOR/16+1)*16;
    this->lpBuffer=new TCHAR[iNewSize];
    this->iCapacity=iNewSize-1;
 }
 for(int i=0; i<iCount; i++)
     this->lpBuffer[i]=ch;
 this->lpBuffer[iCount]=0;
 this->iLen=iCount;
 return *this;
}


String String::Left(int iNum)   //  strncpy = _tcsncpy
{
 if(iNum<this->iLen)
 {
    int iNewSize=(iNum*EXPANSION_FACTOR/16+1)*16;
    String sr(iNewSize,false);
    _tcsncpy(sr.lpBuffer,this->lpBuffer,iNum);
    sr.lpBuffer[iNum]=0;
    sr.iLen=iNum;
    return sr;
 }
 else
 {
    String sr=*this;
    return sr;
 }
}


String String::Right(int iNum)  //Returns Right$(strMain,iNum)
{
 if(iNum<this->iLen)
 {
    int iNewSize=(iNum*EXPANSION_FACTOR/16+1)*16;
    String sr(iNewSize,false);
    _tcsncpy(sr.lpBuffer,this->lpBuffer+this->iLen-iNum,iNum);
    sr.lpBuffer[iNum]=_T('\0');
    sr.iLen=iNum;
    return sr;
 }
 else
 {
    String sr=*this;
    sr.iLen=this->iLen;
    return sr;
 }
}


String String::Mid(int iStart, int iCount)
{
 if(iStart<1 || iStart>this->iLen || iCount<1)   //Out of range start or empty count: return empty String
 {
    String sr;
    return sr;
 }
 if(iCount+iStart>this->iLen)
    iCount=this->iLen-iStart+1;
 String sr(iCount,false);
 _tcsncpy(sr.lpBuffer,this->lpBuffer+iStart-1,iCount);
 sr.lpBuffer[iCount]=_T('\0');
 sr.iLen=iCount;

 return sr;
}


String String::Replace(TCHAR* pMatch, TCHAR* pNew)  //strncmp = _tcsncmp
{
 int i,iLenMatch,iLenNew,iCountMatches,iExtra,iExtraLengthNeeded,iAllocation,iCtr;
 iLenMatch=_tcslen(pMatch);
 iCountMatches=0, iAllocation=0, iCtr=0;
 iLenNew=_tcslen(pNew);
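 if(iLenMatch==0)                        //Empty match string: nothing to replace (and it would loop forever below)
 {
    String sr=*this;
    return sr;
 }
 if(iLenNew==0)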
 {
    String sr=this->Remove(pMatch,true); //return
    return sr;
 }
 else
 {
    iExtra=iLenNew-iLenMatch;
    for(i=0; i<this->iLen; i++)
    {
        if(_tcsncmp(lpBuffer+i,pMatch,iLenMatch)==0)
        {
           iCountMatches++;  //Count how many match strings
           i+=iLenMatch-1;   //Skip past this match so overlapping hits aren't counted twice
        }
    }
    iExtraLengthNeeded=iCountMatches*iExtra;
    iAllocation=this->iLen+iExtraLengthNeeded;
    String sr(iAllocation,false);
    for(i=0; i<this->iLen; i++)
    {
        if(_tcsncmp(this->lpBuffer+i,pMatch,iLenMatch)==0)
        {
           _tcscpy(sr.lpBuffer+iCtr,pNew);
           iCtr+=iLenNew;
           i+=iLenMatch-1;
        }
        else
        {
           sr.lpBuffer[iCtr]=this->lpBuffer[i];
           iCtr++;
        }
        sr.lpBuffer[iCtr]=_T('\0');
    }
    sr.iLen=iCtr;
    return sr;
 }
}


String String::Remove(TCHAR* pStr)
{
 unsigned int i,j,iStrLen,iParamLen;
 TCHAR *pThis, *pThat, *p;
 bool blnFoundBadTCHAR;

 iStrLen=this->iLen;               //The length of this
 String sr((int)iStrLen,false);    //Create new String big enough to contain original String (this)
 iParamLen=_tcslen(pStr);          //Get length of parameter (pStr) which contains TCHARs to be removed
 pThis=this->lpBuffer;
 p=sr.lpStr();
 for(i=0; i<iStrLen; i++)
 {
     pThat=pStr;
     blnFoundBadTCHAR=false;
     for(j=0; j<iParamLen; j++)
     {
         if(*pThis==*pThat)
         {
            blnFoundBadTCHAR=true;
            break;
         }
         pThat++;
     }
     if(!blnFoundBadTCHAR)
     {
        *p=*pThis;
         p++;
        *p=_T('\0');
     }
     pThis++;
 }
 sr.iLen=_tcslen(sr.lpStr());

 return sr;
}


String String::Remove(const TCHAR* pMatch, bool blnCaseSensitive)
{
 int i,iCountMatches=0,iCtr=0;

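 int iLenMatch=_tcslen(pMatch);
 if(iLenMatch==0)                                     //Empty match string: nothing to remove
 {
    String sr=*this;
    return sr;
 }
 for(i=0; i<this->iLen; i++)
 {
     if(blnCaseSensitive)
     {
        if(_tcsncmp(lpBuffer+i,pMatch,iLenMatch)==0)  //_tcsncmp
        {
           iCountMatches++;
           i+=iLenMatch-1;                            //Skip past this match so overlaps aren't counted twice
        }
     }
     else
     {
        if(_tcsnicmp(lpBuffer+i,pMatch,iLenMatch)==0) //__tcsnicmp
        {
           iCountMatches++;
           i+=iLenMatch-1;
        }
     }
 }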
 int iAllocation=this->iLen-(iCountMatches*iLenMatch);
 String sr(iAllocation,false);
 for(i=0; i<this->iLen; i++)
 {
     if(blnCaseSensitive)
     {
        if(_tcsncmp(this->lpBuffer+i,pMatch,iLenMatch)==0)
           i+=iLenMatch-1;
        else
        {
           sr.lpBuffer[iCtr]=this->lpBuffer[i];
           iCtr++;
        }
        sr.lpBuffer[iCtr]=_T('\0');
     }
     else
     {
        if(_tcsnicmp(this->lpBuffer+i,pMatch,iLenMatch)==0)
           i+=iLenMatch-1;
        else
        {
           sr.lpBuffer[iCtr]=this->lpBuffer[i];
           iCtr++;
        }
        sr.lpBuffer[iCtr]=_T('\0');
     }
 }
 sr.iLen=iCtr;
 return sr;
}


int String::ParseCount(const TCHAR c)  //returns one more than # of
{                                      //delimiters so it accurately
 int iCtr=0;                           //reflects # of strings delimited
 TCHAR* p;                             //by delimiter.

 p=this->lpBuffer;
 while(*p)
 {
  if(*p==c)
     iCtr++;
  p++;
 }

 return ++iCtr;
}


void String::Parse(String* pStr, TCHAR delimiter)
{
 unsigned int i=0;
 TCHAR* pBuffer=0;
 TCHAR* c;
 TCHAR* p;

 pBuffer=new TCHAR[this->iLen+1];
 if(pBuffer)
 {
    pBuffer[0]=0, p=pBuffer;
    c=this->lpBuffer;
    while(*c)
    {
       if(*c==delimiter)
       {
          pStr[i]=pBuffer,  p=pBuffer;
          i++,              pBuffer[0]=0;
       }
       else
       {
          *p=*c,  p++;
          *p=0;
       }
       c++;
    }
    pStr[i]=pBuffer;
    delete [] pBuffer;
 }
}


int iMatch(TCHAR* pThis, const TCHAR* pStr, bool blnCaseSensitive, bool blnStartBeginning, int i, int iParamLen)
{
 if(blnCaseSensitive)
 {
    if(_tcsncmp(pThis+i,pStr,iParamLen)==0)   //_tcsncmp
       return i+1;
 }
 else
 {
    if(_tcsnicmp(pThis+i,pStr,iParamLen)==0)  //__tcsnicmp
       return i+1;
 }

 return 0;
}


int String::InStr(const TCHAR* pStr, bool blnCaseSensitive, bool blnStartBeginning)
{
 int i,iParamLen,iRange,iReturn;

 if(*pStr==0)
    return 0;
 iParamLen=_tcslen(pStr);
 iRange=this->iLen-iParamLen;
 if(blnStartBeginning)
 {
    if(iRange>=0)
    {
       for(i=0; i<=iRange; i++)
       {
           iReturn=iMatch(this->lpBuffer,pStr,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
       }
    }
 }
 else
 {
    if(iRange>=0)
    {
       for(i=iRange; i>=0; i--)
       {
           iReturn=iMatch(this->lpBuffer,pStr,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
       }
    }
 }

 return 0;
}


int String::InStr(const String& s, bool blnCaseSensitive, bool blnStartBeginning)
{
 int i,iParamLen,iRange,iReturn;

 if(s.iLen==0)
    return 0;
 iParamLen=s.iLen;
 iRange=this->iLen-iParamLen;
 if(blnStartBeginning)
 {
    if(iRange>=0)
    {
       for(i=0; i<=iRange; i++)
       {
           iReturn=iMatch(this->lpBuffer,s.lpBuffer,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
           /*
           if(blnCaseSensitive)
           {
              if(_tcsncmp(this->lpBuffer+i,s.lpBuffer,iParamLen)==0)  //_tcsncmp
                 return i+1;
           }
           else
           {
              if(_tcsnicmp(this->lpBuffer+i,s.lpBuffer,iParamLen)==0) //__tcsnicmp
                 return i+1;
           }
           */
       }
    }
 }
 else
 {
    if(iRange>=0)
    {
       for(i=iRange; i>=0; i--)
       {
           iReturn=iMatch(this->lpBuffer,s.lpBuffer,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
           /*
           if(blnCaseSensitive)
           {
              if(_tcsncmp(lpBuffer+i,s.lpBuffer,iParamLen)==0)   //_tcsncmp
                 return i+1;
           }
           else
           {
              if(_tcsnicmp(lpBuffer+i,s.lpBuffer,iParamLen)==0)  //__tcsnicmp
                 return i+1;
           }
           */
       }
    }
 }

 return 0;
}


void String::LTrim()
{
 int iCt=0;

 for(int i=0; i<this->iLen; i++)
 {
     if(this->lpBuffer[i]==32 || this->lpBuffer[i]==9)
        iCt++;
     else
        break;
 }
 if(iCt)
 {
    for(int i=iCt; i<=this->iLen; i++)
        this->lpBuffer[i-iCt]=this->lpBuffer[i];
 }
 this->iLen=this->iLen-iCt;
}


void String::RTrim()
{
 int iCt=0;

 for(int i=this->iLen-1; i>=0; i--)      //i>=0 so a String that is all whitespace is trimmed to empty
 {
     if(this->lpBuffer[i]==9||this->lpBuffer[i]==10||this->lpBuffer[i]==13||this->lpBuffer[i]==32)
        iCt++;
     else
        break;
 }
 this->lpBuffer[this->iLen-iCt]=0;
 this->iLen=this->iLen-iCt;
}


void String::Trim()
{
 this->LTrim();
 this->RTrim();
}


int String::iVal()
{
 return _ttoi(this->lpBuffer);  //_ttoi
}


int String::Len(void)
{
 return this->iLen;
}


int String::Capacity(void)
{
 return this->iCapacity;
}


TCHAR* String::lpStr()
{
 return lpBuffer;
}


void String::Print(TCHAR* pStr, bool blnCrLf)
{
 _tprintf(_T("%s%s"),pStr,lpBuffer);
 if(blnCrLf)
    _tprintf(_T("\n"));
}


String::~String()   //String Destructor
{
 delete [] lpBuffer;
 lpBuffer=0;
}
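
Since InStr() returns a one based offset (0 means not found) and can search from either end, here is a minimal standalone sketch of just that search.  It is not the class member: inStrSketch is a hypothetical free function, it uses wchar_t directly instead of TCHAR so it builds without tchar.h, and it only covers the case sensitive path.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical free function: one based offset of pFind in pText, or 0 if absent.
// blnFromLeft=true scans from the start, false scans from the end.
int inStrSketch(const wchar_t* pText, const wchar_t* pFind, bool blnFromLeft)
{
 int iLen=(int)std::wcslen(pText);
 int iFindLen=(int)std::wcslen(pFind);
 int iRange=iLen-iFindLen;            //last zero based position where a match can start
 if(iFindLen==0 || iRange<0)
    return 0;
 if(blnFromLeft)
 {
    for(int i=0; i<=iRange; i++)
        if(std::wcsncmp(pText+i,pFind,iFindLen)==0)
           return i+1;
 }
 else
 {
    for(int i=iRange; i>=0; i--)
        if(std::wcsncmp(pText+i,pFind,iFindLen)==0)
           return i+1;
 }
 return 0;
}

void inStrSketchDemo()
{
 const wchar_t* pText=L"one two one";
 assert(inStrSketch(pText,L"one",true)==1);    //first "one"
 assert(inStrSketch(pText,L"one",false)==9);   //last "one"
 assert(inStrSketch(pText,L"three",true)==0);  //not found
}
```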
« Last Edit: February 18, 2013, 01:08:55 AM by Frederick J. Harris »

Offline Frederick J. Harris

Re: Latest String Class
« Reply #2 on: February 15, 2013, 11:17:30 PM »
One of the more interesting things to do with this is to build a version with piles of debug output statements in it, so you can see exactly which constructors, operators, and destructors get called for any given line of code.
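
As a rough idea of what such tracing shows, here is a minimal sketch using a hypothetical stand-in class named Traced rather than the String class itself.  It owns no buffer and only records which special member function ran; in a real debug build of Strings.cpp the push_back calls would presumably be _tprintf calls instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Call log for the sketch (an assumption; the real class has nothing like this)
std::vector<std::string> g_calls;

// Hypothetical stand-in for String: records each call instead of doing work
struct Traced
{
 Traced()                         { g_calls.push_back("default ctor"); }
 Traced(const Traced&)            { g_calls.push_back("copy ctor"); }
 Traced& operator=(const Traced&) { g_calls.push_back("operator="); return *this; }
 ~Traced()                        { g_calls.push_back("dtor"); }
};

void tracedDemo()
{
 g_calls.clear();
 {
    Traced a;        //default ctor
    Traced b=a;      //copy ctor, even though it is written with an equal sign
    b=a;             //operator=
 }                   //dtor for b, then dtor for a
 assert(g_calls.size()==5);
 assert(g_calls[1]=="copy ctor");
 assert(g_calls[2]=="operator=");
}
```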

To test this, use the program I posted in your forum a couple weeks ago.  I tested that with the 64 bit mingw and it worked fine.  Why don't I just repost it?

Code: [Select]
//frmOutput.h
LRESULT CALLBACK frmOutput(HWND, unsigned int, WPARAM, LPARAM);

Code: [Select]
//frmOutput.cpp 

//All this file consists of is the Window Procedure for the frmOutput
//Window Class.  A window of class frmOutput is created whose sole purpose is to be
//printed to with TextOut().  All default window procedure processing is ok so the
//window procedure frmOutput doesn't have to do anything on its own other than exist
//and pass all messages received onto DefWindowProc().  It has to exist because it's
//referenced in the Class Registration code in fnWndProc_OnCreate() where the frmOutput
//class is registered.
#include <windows.h>
#include "frmOutput.h"

LRESULT CALLBACK frmOutput(HWND hwnd, unsigned int msg, WPARAM wParam,LPARAM lParam)
{
 return (DefWindowProc(hwnd, msg, wParam, lParam));
}

Code: [Select]
//StrDemo.h
#ifndef StrDemo_h
#define StrDemo_h

#define IDC_CONCATENATION       1500
#define IDC_PARSE               1505
#define IDC_LEFT_RIGHT_MID      1510
#define IDC_INSTR               1515

#define dim(x) (sizeof(x) / sizeof(x[0]))

struct                          WindowsEventArguments
{
 HWND                           hWnd;
 WPARAM                         wParam;
 LPARAM                         lParam;
 HINSTANCE                      hIns;
};

typedef WindowsEventArguments*  lpWndEventArgs;

long fnWndProc_OnCreate         (lpWndEventArgs Wea);
long fnWndProc_OnCommand        (lpWndEventArgs Wea);
long fnWndProc_OnDestroy        (lpWndEventArgs Wea);

void btnConcatenation_OnClick   (void);
void btnParse_OnClick           (void);
void btnLeftRightMid_OnClick    (void);
void btnInStr_OnClick           (void);

struct EVENTHANDLER
{
 unsigned int                   iMsg;
 long                           (*fnPtr)(lpWndEventArgs);
};

const EVENTHANDLER EventHandler[]=
{
 {WM_CREATE,                    fnWndProc_OnCreate},
 {WM_COMMAND,                   fnWndProc_OnCommand},
 {WM_DESTROY,                   fnWndProc_OnDestroy}
};

#endif

Code: [Select]
// Main.cpp
// For Compiling With mingw - g++ -Wall main.cpp Strings.cpp frmOutput.cpp -ostrDemo.exe -mwindows -m64 -Os -s
// For Compiling With VC++ -  WIN32;NDEBUG;_WINDOWS;_CRT_SECURE_NO_WARNINGS;_CRT_NON_CONFORMING_SWPRINTFS
// Also For VC++ - comment out UNICODE and _UNICODE, as these are set in Preprocessor Definitions
//#define UNICODE
//#define _UNICODE
#include <windows.h>
#include <tchar.h>
#include <stdio.h>
#include "StrDemo.h"
#include "frmOutput.h"
#include "Strings.h"


long fnWndProc_OnCreate(lpWndEventArgs Wea)
{
 TCHAR szClassName[]=_T("frmOutput");
 WNDCLASSEX wc;

 Wea->hIns=((LPCREATESTRUCT)Wea->lParam)->hInstance;
 wc.lpszClassName=szClassName;                          wc.lpfnWndProc=frmOutput;
 wc.cbSize=sizeof (WNDCLASSEX);                         wc.style=CS_DBLCLKS;
 wc.hIcon=LoadIcon(NULL,IDI_APPLICATION);               wc.hInstance=Wea->hIns;
 wc.hIconSm=NULL;                                       wc.hCursor=LoadCursor(NULL,IDC_ARROW);
 wc.hbrBackground=(HBRUSH)GetStockObject(WHITE_BRUSH);  wc.cbWndExtra=0;
 wc.lpszMenuName=NULL;                                  wc.cbClsExtra=0;
 RegisterClassEx(&wc);
 CreateWindow(_T("button"),_T("String Concatenation"),WS_CHILD|WS_VISIBLE,45,20,250,25,Wea->hWnd,(HMENU)IDC_CONCATENATION,Wea->hIns,0);
 CreateWindow(_T("button"),_T("String Parsing And Trimming"),WS_CHILD|WS_VISIBLE,45,55,250,25,Wea->hWnd,(HMENU)IDC_PARSE,Wea->hIns,0);
 CreateWindow(_T("button"),_T("Good Old Left, Right, And Mid!"),WS_CHILD|WS_VISIBLE,45,90,250,25,Wea->hWnd,(HMENU)IDC_LEFT_RIGHT_MID,Wea->hIns,0);
 CreateWindow(_T("button"),_T("But Let's Not Forget InStr( )!"),WS_CHILD|WS_VISIBLE,45,125,250,25,Wea->hWnd,(HMENU)IDC_INSTR,Wea->hIns,0);

 return 0;
}


void btnConcatenation_OnClick(void)
{
 HWND hOutput;
 String s;
 int iY=0;
 HDC hDC;

 hOutput=CreateWindow(_T("frmOutput"),_T("String Class Manipulations In C++"),WS_OVERLAPPEDWINDOW,450,20,305,480,0,0,GetModuleHandle(0),0);
 ShowWindow(hOutput,SW_SHOWNORMAL);
 hDC=GetDC(hOutput);
 for(unsigned int i=65; i<91; i++)
 {
     s=s+i;                                   //65 Is Capital 'A'
     TextOut(hDC,0,iY,s.lpStr(),s.Len());     //TextOut() is the Windows Version Of Basic's 'Print'
     iY=iY+16;                                //A Spacing Of 16 Pixels Works Good For Most Normal Sized Fonts
 }
 ReleaseDC(hOutput,hDC);
}


void btnParse_OnClick(void)
{
 String* ptrStrings;
 unsigned int iCt;
 HWND hOutput;
 String s;
 int iY=0;
 HDC hDC;

 hOutput=CreateWindow(_T("frmOutput"),_T("String Class Manipulations In C++"),WS_OVERLAPPEDWINDOW,50,400,310,180,0,0,GetModuleHandle(0),0);
 ShowWindow(hOutput,SW_SHOWNORMAL);
 hDC=GetDC(hOutput);
 s=_T("Frederick,  Mary,  James,  Samuel,  Edward,  Richard,  Michael,  Joseph");  //Create comma delimited text string
 iCt=s.ParseCount(_T(','));                                                        //Count delimited substrings
 ptrStrings = new String[iCt];                                                     //Create array of iCt String objects
 s.Parse(ptrStrings,_T(','));                                                      //Parse 's' based on comma delimiter
 for(unsigned int i=0; i<iCt; i++)
 {
     ptrStrings[i].LTrim();  //Comment This Out To See Effect.                     //There are some spaces in delimited strings so remove them
     TextOut(hDC,0,iY,ptrStrings[i].lpStr(),ptrStrings[i].Len());                  //Output/Print array of strings one at a time
     iY=iY+16;
 }
 delete [] ptrStrings;
 ReleaseDC(hOutput,hDC);
}


void btnLeftRightMid_OnClick(void)
{
 HWND hOutput;
 String s1,s2;
 HDC hDC;

 hOutput=CreateWindow(_T("frmOutput"),_T("String Class Manipulations In C++"),WS_OVERLAPPEDWINDOW,800,100,325,250,0,0,GetModuleHandle(0),0);
 ShowWindow(hOutput,SW_SHOWNORMAL);
 hDC=GetDC(hOutput);
 s1=_T("George Washington Carver");
 s2=s1.Left(6);
 TextOut(hDC,20,20,s1.lpStr(),s1.Len());
 TextOut(hDC,20,40,s2.lpStr(),s2.Len());
 s2=s1.Mid(8,10);
 TextOut(hDC,20,60,s2.lpStr(),s2.Len());
 s2=s1.Right(6);
 TextOut(hDC,20,80,s2.lpStr(),s2.Len());
 ReleaseDC(hOutput,hDC);
}


void btnInStr_OnClick(void)
{
 String s1,s3;
 double dblPi;
 HWND hOutput;
 HDC hDC;

 hOutput=CreateWindow(_T("frmOutput"),_T("String Class Manipulations In C++"),WS_OVERLAPPEDWINDOW,550,575,525,200,0,0,GetModuleHandle(0),0);
 ShowWindow(hOutput,SW_SHOWNORMAL);
 hDC=GetDC(hOutput);
 s1=_T("InStr('Some Text') Locates The One Based Offset Of Where Something Is At.");
 TextOut(hDC,0,0,s1.lpStr(),s1.Len());
 s1=_T("C++ Can Be A Real Pain In The Butt To Learn!");
 TextOut(hDC,0,16,s1.lpStr(),s1.Len());
 String s2(s1.InStr(_T("real"),false,false));
 s3 =_T("The Offset Of 'real' In The Above String Is ");
 s3 = s3 + s2 + _T('.');
 TextOut(hDC,0,32,s3.lpStr(),s3.Len());
 s1=_T("Note In The Above Code We Used The Case Insensitive Version Of InStr().");
 TextOut(hDC,0,48,s1.lpStr(),s1.Len());
 s1=_T("Let's See If We Can Remove() 'Real' From Our 1st String!");
 TextOut(hDC,0,64,s1.lpStr(),s1.Len());
 s1=_T("C++ Can Be A Real Pain In The Butt To Learn!");
 TextOut(hDC,0,80,s1.lpStr(),s1.Len());
 s2=s1.Remove(_T("Real"),true);
 TextOut(hDC,0,96,s2.lpStr(),s2.Len());
 s1=_T("Dieses Program funktioniert aber sehr gut, nicht wahr?");
 TextOut(hDC,0,112,s1.lpStr(),s1.Len());
 dblPi=3.14159;
 s1=_T("dblPi = ");
 s2=dblPi;
 s1=s1+s2;
 TextOut(hDC,0,128,s1.lpStr(),s1.Len());
 ReleaseDC(hOutput,hDC);
}


long fnWndProc_OnCommand(lpWndEventArgs Wea)
{
 switch(LOWORD(Wea->wParam))
 {
  case IDC_CONCATENATION:
    btnConcatenation_OnClick();
    break;
  case IDC_PARSE:
    btnParse_OnClick();
    break;
  case IDC_LEFT_RIGHT_MID:
    btnLeftRightMid_OnClick();
    break;
  case IDC_INSTR:
    btnInStr_OnClick();
    break;
 }

 return 0;
}


long fnWndProc_OnDestroy(lpWndEventArgs Wea)
{
 PostQuitMessage(0);
 return 0;
}


LRESULT CALLBACK fnWndProc(HWND hwnd, unsigned int msg, WPARAM wParam,LPARAM lParam)
{
 WindowsEventArguments Wea;

 for(unsigned int i=0; i<dim(EventHandler); i++)
 {
     if(EventHandler[i].iMsg==msg)
     {
        Wea.hWnd=hwnd, Wea.lParam=lParam, Wea.wParam=wParam;
        return (*EventHandler[i].fnPtr)(&Wea);
     }
 }

 return (DefWindowProc(hwnd, msg, wParam, lParam));
}


int WINAPI WinMain(HINSTANCE hIns, HINSTANCE hPrevIns, LPSTR lpszArgument, int iShow)
{
 TCHAR szClassName[]=_T("StrDemo");
 WNDCLASSEX wc;
 MSG messages;
 HWND hWnd;

 wc.lpszClassName=szClassName;                wc.lpfnWndProc=fnWndProc;
 wc.cbSize=sizeof (WNDCLASSEX);               wc.style=0;
 wc.hIcon=LoadIcon(NULL,IDI_APPLICATION);     wc.hInstance=hIns;
 wc.hIconSm=0,                                wc.hCursor=LoadCursor(NULL,IDC_ARROW);
 wc.hbrBackground=(HBRUSH)COLOR_BTNSHADOW;    wc.cbWndExtra=0;
 wc.lpszMenuName=NULL;                        wc.cbClsExtra=0;
 RegisterClassEx(&wc);
 hWnd=CreateWindowEx(0,szClassName,_T("StrDemo"),WS_OVERLAPPEDWINDOW,20,20,350,210,HWND_DESKTOP,0,hIns,0);
 ShowWindow(hWnd,iShow);
 while(GetMessage(&messages,NULL,0,0))
 {
    TranslateMessage(&messages);
    DispatchMessage(&messages);
 }

 return (int)messages.wParam;
}
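
For anyone who wants to check the parsing logic without a window, here is a minimal console-side sketch of what the String Parsing And Trimming button does.  It is a standalone analog and not the class itself: parseCountSketch and parseSketch are hypothetical names, and it uses std::wstring and std::vector purely to stay self-contained.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One more than the number of delimiters, the same rule as String::ParseCount()
int parseCountSketch(const std::wstring& s, wchar_t delimiter)
{
 int iCtr=0;
 for(size_t i=0; i<s.size(); i++)
     if(s[i]==delimiter)
        iCtr++;
 return iCtr+1;
}

// Splits s on delimiter and drops leading spaces/tabs from each field,
// which is what Parse() followed by LTrim() on each element produces
std::vector<std::wstring> parseSketch(const std::wstring& s, wchar_t delimiter)
{
 std::vector<std::wstring> fields(1);
 for(size_t i=0; i<s.size(); i++)
 {
     if(s[i]==delimiter)
        fields.push_back(std::wstring());                  //start the next field
     else if(!(fields.back().empty() && (s[i]==L' ' || s[i]==L'\t')))
        fields.back()+=s[i];                               //skip only leading whitespace
 }
 return fields;
}

void parseSketchDemo()
{
 std::wstring s=L"Frederick,  Mary,  James";
 std::vector<std::wstring> fields=parseSketch(s,L',');
 assert(parseCountSketch(s,L',')==3);
 assert(fields.size()==3);
 assert(fields[1]==L"Mary");
}
```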
« Last Edit: February 18, 2013, 01:11:31 AM by Frederick J. Harris »

Offline Frederick J. Harris

Re: Latest String Class
« Reply #3 on: February 15, 2013, 11:27:15 PM »
And here is a simple version of the program I just posted, the one with the strGetText(HWND hCtl) function that gets the text out of a control, only this one uses my String class instead of the C++ Std. Lib. one ...

Code: [Select]
//Main.cpp
#define  UNICODE
#define  _UNICODE
#include <windows.h>
#include <tchar.h>
#include "Strings.h"
#define  IDC_BUTTON 1600


String strGetText(HWND hCtrl)
{
 TCHAR szBuffer[4096];
 GetWindowText(hCtrl,szBuffer,4096);
 return String(szBuffer);
}


LRESULT CALLBACK fnWndProc(HWND hwnd, unsigned int msg, WPARAM wParam, LPARAM lParam)
{
 switch(msg)
 {
  case WM_CREATE:
    {
       HINSTANCE hIns=((LPCREATESTRUCT)lParam)->hInstance;
       CreateWindowEx(0,_T("button"),_T("Click Me!"),WS_CHILD|WS_VISIBLE,75,50,150,30,hwnd,(HMENU)IDC_BUTTON,hIns,0);
       return 0;
    }
  case WM_COMMAND:
    {
       if(LOWORD(wParam)==IDC_BUTTON)
       {
          HWND hMain=GetForegroundWindow();
          MessageBox(hwnd,strGetText(hMain).lpStr(),_T("Your Text..."),MB_OK);
       }
       return 0;
    }
  case WM_DESTROY:
    {
       PostQuitMessage(0);
       return 0;
    }
 }

 return (DefWindowProc(hwnd, msg, wParam, lParam));
}


int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevIns, LPSTR lpszArgument, int iShow)
{
 TCHAR szClassName[]=_T("Text Wrapper Demo");
 WNDCLASSEX wc={};
 MSG messages;
 HWND hWnd;

 wc.lpszClassName = szClassName;
 wc.lpfnWndProc   = fnWndProc;
 wc.cbSize        = sizeof(WNDCLASSEX);
 wc.hbrBackground = (HBRUSH)COLOR_BTNSHADOW;
 wc.hInstance     = hInstance;
 RegisterClassEx(&wc);
 hWnd=CreateWindowEx(0,szClassName,szClassName,WS_OVERLAPPEDWINDOW,200,175,320,200,HWND_DESKTOP,0,hInstance,0);
 ShowWindow(hWnd,iShow);
 while(GetMessage(&messages,NULL,0,0))
 {
    TranslateMessage(&messages);
    DispatchMessage(&messages);
 }

 return (int)messages.wParam;
}

Using mine instead of the C++ Std. Lib. one saved about 32K or so.  With my older version of MinGW that comes with Code::Blocks 10.05 it came in at 27K.  With the C++ Std. Lib. one it was 59K, I think.  With Visual Studio set up to create the exe as a standalone without needing the C++ Runtime Dll you'll likely see considerably bigger numbers, because MS doesn't try too hard to make things small.  Any trouble with this let me know.  I have VS to test it on.  Likely you'll need to stick a few defines such as _CRT_SECURE_NO_WARNINGS in the preprocessor directives to quiet down the warnings about unsafe string functions, and so forth.  I'm a pretty unsafe kind of guy I guess.  Actually, I use C++ mostly on our handheld data collectors, so security isn't an issue for me.

With Visual C++ you'll get warnings about the #define UNICODE stuff.  That's because those defines are fed into the compiler directly by VS itself in the preprocessor directives, I believe.  You can just comment them out.
« Last Edit: February 15, 2013, 11:31:18 PM by Frederick J. Harris »

Offline Charles Pegge

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 678
  • User-Rate: +27/-1
    • Charles Pegge
Re: Latest String Class
« Reply #4 on: February 16, 2013, 08:10:53 AM »
Hi Fred,

I'm interested in dynamic multiple string concatenation, and string garbage collection to get as close as possible to Basic string handling.

CONCATENATION:

StringJoin(r,s1,s2,s3,s4...NULL)  // using ellipsis for unlimited number of params

GARBAGE COLLECTION

StringTempListDelete() // following expressions containing intermediate strings
StringLocalListDelete() // at end of procedures containing local strings
StringStaticListDelete() // at end of program containing global and static strings

Does this fit in with your string class?

Charles
« Last Edit: February 16, 2013, 12:39:53 PM by Charles Pegge »

Offline Patrice Terrier

  • ROMs
  • Hero Member
  • *****
  • Posts: 934
  • User-Rate: +62/-1
    • www.zapsolution.com
Re: Latest String Class
« Reply #5 on: February 16, 2013, 05:09:44 PM »
Fred,

Thank you!

I was just looking for a close replacement for INSTR/PARSE  :)

...
« Last Edit: February 16, 2013, 05:13:05 PM by Patrice Terrier »
Patrice Terrier
GDImage (advanced graphic addon)
http://www.zapsolution.com

Offline Patrice Terrier

Re: Latest String Class
« Reply #6 on: February 16, 2013, 05:16:30 PM »
Added

Missing INSTR( from the end (using a minus value) )

what is the closest to use with wstring?

...

Offline Patrice Terrier

Re: Latest String Class
« Reply #7 on: February 16, 2013, 06:32:46 PM »
I found that for wstring, the easy way is to use something like this sMainString.find_last_of(L".")
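As a rough sketch of an INSTR that searches from the end, rfind() does most of the work. The helper name InStrRev and the BASIC-style convention (1-based position, 0 when not found) are assumptions made for this example, not part of the standard library:

```cpp
#include <string>

// Hypothetical helper: 1-based position of the last occurrence of sFind
// in sMain, or 0 when it is absent (BASIC convention, not a std API).
long InStrRev(const std::wstring& sMain, const std::wstring& sFind) {
    if (sFind.empty()) return 0;
    std::wstring::size_type pos = sMain.rfind(sFind);
    return (pos == std::wstring::npos) ? 0 : static_cast<long>(pos) + 1;
}
```

Note that find_last_of() treats its argument as a set of characters and matches any one of them, while rfind() matches the whole substring. For a single character such as L"." the two give the same position.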

Now I have to figure out the best way to mimic UCASE/LCASE with wstring  :-\
Not obvious in the case of Unicode, not to mention UTF-8  :(

...
« Last Edit: February 16, 2013, 06:34:17 PM by Patrice Terrier »

Offline Patrice Terrier

C++ wstring UCASE
« Reply #8 on: February 16, 2013, 07:33:39 PM »
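#include <windows.h>   // For the IN annotation macro.
#include <string>
#include <algorithm>
#include <cwctype>     // towupper, towlower
using namespace std;

// towupper/towlower convert one wchar_t at a time using the current C locale,
// so they do not cover full Unicode case mapping (e.g. one-to-many mappings).
wstring UCASE(IN wstring sBuf) {
    std::transform(sBuf.begin(), sBuf.end(), sBuf.begin(), ::towupper);
    return sBuf;
}

wstring LCASE(IN wstring sBuf) {
    std::transform(sBuf.begin(), sBuf.end(), sBuf.begin(), ::towlower);
    return sBuf;
}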

Offline Charles Pegge

Re: Latest String Class
« Reply #9 on: February 16, 2013, 07:52:31 PM »
Just to let you know what happens to Unicode case conversion beyond UTF8...

http://www.unicode.org/faq/casemap_charprop.html

ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt

Offline Frederick J. Harris

Re: Latest String Class
« Reply #10 on: February 16, 2013, 09:01:22 PM »
Quote
StringJoin(r,s1,s2,s3,s4...NULL)  // using ellipsis for unlimited number of params

I believe that sort of construct is the only possibility within the purely procedural (non-OOP) paradigm, Charles.  Of course the BASIC languages have always used the more natural construct ...

Code: [Select]
Local strFullName As String
strFullName = "Frederick" & "John" & "Harris"
   

...and until recently (PB9), PowerBASIC at least was a purely procedural, non-OOP language.  But I'd make the case that dynamic strings in BASIC of whatever dialect have always been 'objects'.

The argument I'd make is as follows.  With simple numbers such as integers or floats the meaning of the '+' symbol and other such symbols as '<', '>', '<=', '>=' is obvious.  But what happens with more complicated objects such as three dimensional shapes and strings?  For a box we might define < or > or = in terms of volume, but it wouldn't have to be so for all usages of boxes.  With strings it seems obvious that concatenating them together would be the most useful interpretation of the '+' or '&' symbol.  But it wouldn't have to be in that case either - another possible interpretation might actually be adding their respective ASCII codes together.  Not very useful to be sure, but that's just interpretation.   So in that sense strings have always been objects - even in non-OOP languages.

I think the only natural use of strings is as objects, as in BASIC, or in C++ where the '+' operator can be overloaded.  Of course, in C++ the '&' symbol is a poor fit because it already serves as the 'address of' and bitwise AND operator.   So in terms of the string class I use I don't believe I'd be interested in adding any functionality to it such as you mentioned, simply because I prefer the OOP orientation of strings as objects that can be concatenated together with the '+' operator.
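To make the contrast concrete, here is a minimal sketch of the ellipsis approach written against std::wstring. The signature is only a guess at what a StringJoin(r,s1,s2,...NULL) might look like in C or C++; it is not taken from Charles' run time library or from my string class:

```cpp
#include <cstdarg>
#include <string>

// Hypothetical StringJoin: appends each C string to r until a null pointer
// is reached. Varargs are not type-checked, so the caller must end the list
// with a null pointer of the right pointer type.
void StringJoin(std::wstring& r, const wchar_t* first, ...) {
    va_list args;
    va_start(args, first);
    for (const wchar_t* p = first; p != nullptr; p = va_arg(args, const wchar_t*)) {
        r += p;
    }
    va_end(args);
}
```

A call would look like StringJoin(r, L"Frederick", L" John", L" Harris", static_cast<const wchar_t*>(nullptr)). The overloaded operator form of the same thing is std::wstring(L"Frederick") + L" John" + L" Harris", which needs no terminator and is checked by the compiler.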

Having said that though I'm fully with you on the idea of performance, and I've had a great deal of difficulty getting high performance out of my string class on tests where massive buffers are created and millions of concatenations are done.   To make a very, very, very long story short, with my string class I haven't been able to come anywhere close to the speed of the string class in the C++ Standard Library.  On the other hand, I've been able to trounce the Std. Lib's string class on other operations such as Replace operations where the buffer needs to be expanded for each replacement.  Of course, if I wanted to badly enough, I could take a few months to study up on STL (which I hate) and then examine the Std Lib's string class to see how they accomplish what they do.  But I just haven't been able to see my way clear to making that investment of time to solve the issue.   

My work with strings is likely very similar to what Patrice is working on now, such as parsing filenames and so forth.  For years and years and years I did that sort of thing with the C string primitives (strcpy, strcat, strlen, etc), and I don't have to tell you how that can eat you alive time wise.  I'll never forget the time I spent a whole afternoon parsing a string sent into a C dll through the caption parameter of a CreateWindow call where I was building a custom control grid for use in my data recorder programs using the C string primitives such as strtok.  It would have been about ten minutes work in PowerBASIC using Parse, but I didn't have a PowerBASIC compiler for Windows CE - only C and C++.

That's actually how I started working on the string class many years ago, to solve problems such as that, and at that distant time I don't know if the C++ Standards Committee had settled on a String Class or not.  There was one as part of the MFC dll, but I didn't want to have to link with the megabyte sized MFC dll just to use its string class.  That's why I started my own.  I also know this was an area many basic coders worked on who made the switch to C or C++.  No one could tolerate the atrocious string issue in those languages.  At some later point I know Microsoft moved the string class out of MFC, but by that time I had my own.

Where things stand for now with me is that I use my own string class mostly because I coded a lot of my favorite PowerBASIC keywords into it such as Parse, Left, Right, Mid, InStr, etc.  Not everything certainly, but quite a few.  Using it in C++ is not as good as the PowerBASIC string functions, but it's light years better than what exists in C with the low level buffer manipulation string functions.  If I ever came upon a situation where really fast processing was needed involving millions of bytes, rather than using it I'd just drop down to C type stuff with the low level C functions, which, as an aside, beat the crap out of anything you can do with vectors, STL, or the Std. Lib. string class.  And this is true in PowerBASIC too.  C and PowerBASIC are essentially equivalent in terms of speed in my opinion where you're working directly in memory buffers with pointers moving bytes around.  Of course that's very slow coding, but if speed matters one could do it.

A note to Patrice:

I could code the InStr thing working the string backwards, I suppose, which would be especially helpful for stripping the file name or extension out of a path, for example, but I just never took the time to do it.  What weighs heavily on my mind, likely foolishly, is that every addition I make to my string class makes it bigger.  It has grown over the years, but is still small.  That's the beauty of having a real string variable type existing within a language such as PowerBASIC or C#, in that the code to make all this good stuff happen is within the compiler - not deposited in your exe whether you need it all or not.

 

Offline Charles Pegge

Re: Latest String Class
« Reply #11 on: February 17, 2013, 11:29:54 AM »
Hi Fred,

Yes I agree strings have some characteristics of objects, and if C had a built-in concept of strings, it would have saved billions of programmer man hours, and persuaded many more Basic users to join the ranks. Perhaps Basic would have become a dead language many years ago :)

Just to explain where I am coming from: one of my tasks is to emulate my RunTime library in C. The library is written almost entirely in Assembly code so I have to find ways of encoding it within the constraints of the C language. This is one of the major components of producing a C emitter for the back-end of my compiler. Most of the run time library consists of the core string functions and auto-garbage collection, in all about 12k of 32 bit binary and 16k for 64 bit.

Currently, I take the view that any standard C libraries that mention the word 'String' should be excluded from further consideration, so we are only working with chars, pointers, malloc and free.

Charles
« Last Edit: February 17, 2013, 11:31:42 AM by Charles Pegge »

Offline Patrice Terrier

Basic wstring LTRIM, RTRIM, TRIM
« Reply #12 on: February 17, 2013, 12:29:14 PM »
Here is my basic version of LTRIM, RTRIM, TRIM to use with unicode wstring.
Like the PowerBASIC version, it is able to handle any single character, not just space.

wstring LTRIM(IN wstring sBuf, IN wstring sChar) {
    wstring sResult = sBuf;
    long nLength = sBuf.length();
    if (nLength && (sChar.length())) {
        sChar = sChar.substr(0, 1);
        long K = 0;
        while (K < nLength) {
            if (*sBuf.substr(K, 1).c_str() == *sChar.c_str()) {
                ++K; }
            else {
                break;
            }
        }
        sResult = sBuf.substr(K, nLength);
    }
    return sResult;
}

wstring RTRIM(IN wstring sBuf, IN wstring sChar) {
    wstring sResult = sBuf;
    long nLength = sBuf.length();
    if (nLength && (sChar.length())) {
        sChar = sChar.substr(0, 1);
        while (nLength > 0) {
            if (*sBuf.substr(nLength - 1, 1).c_str() == *sChar.c_str()) {
                --nLength; }
            else {
                break;
            }
        }
        sResult = sBuf.substr(0, nLength);
    }
    return sResult;
}

wstring TRIM(IN wstring sBuf, IN wstring sChar) {
    return LTRIM(RTRIM(sBuf, sChar), sChar);
}


It is up to you to add START_FROM and ANY (looping on sChar.length()).

...
« Last Edit: February 17, 2013, 12:32:57 PM by Patrice Terrier »

Offline Patrice Terrier

Basic wstring LTRIM, RTRIM, TRIM (ANY version)
« Reply #13 on: February 17, 2013, 05:11:55 PM »
Here is the ANY version

wstring LTRIM(IN wstring sBuf, IN wstring sChar) {
    wstring sResult = sBuf;
    long nLength = sBuf.length();
    if (nLength && (sChar.length())) {
        //sChar = sChar.substr(0, 1);
        long K = 0;
        while (K < nLength) {
            // if (*sBuf.substr(K, 1).c_str() == *sChar.c_str()) {
            // This is the sChar ANY search version
            if (std::wstring::npos != sChar.find(sBuf.substr(K, 1))) {

                ++K; }
            else {
                break;
            }
        }
        sResult = sBuf.substr(K, nLength);
    }
    return sResult;
}

wstring RTRIM(IN wstring sBuf, IN wstring sChar) {
    wstring sResult = sBuf;
    long nLength = sBuf.length();
    if (nLength && (sChar.length())) {
        //sChar = sChar.substr(0, 1);
        while (nLength > 0) {
            // if (*sBuf.substr(nLength - 1, 1).c_str() == *sChar.c_str()) {
            // This is the sChar ANY search version
            if (std::wstring::npos != sChar.find(sBuf.substr(nLength - 1, 1))) {

                --nLength; }
            else {
                break;
            }
        }
        sResult = sBuf.substr(0, nLength);
    }
    return sResult;
}


Comment:
Code in red is typical of the abstruse C++ syntax I shall forget as soon as I have been able to write the core code of the functions.  ???
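For comparison, a shorter sketch of the same ANY behaviour can be built on find_first_not_of() and find_last_not_of(), which are standard wstring members. The names LTrimAny and RTrimAny are made up for this example so they do not clash with the functions above:

```cpp
#include <string>

// Sketch of the ANY variants: strip every leading (or trailing) character
// that appears anywhere in sChars. An empty sChars leaves sBuf unchanged.
std::wstring LTrimAny(const std::wstring& sBuf, const std::wstring& sChars) {
    std::wstring::size_type first = sBuf.find_first_not_of(sChars);
    return (first == std::wstring::npos) ? std::wstring() : sBuf.substr(first);
}

std::wstring RTrimAny(const std::wstring& sBuf, const std::wstring& sChars) {
    std::wstring::size_type last = sBuf.find_last_not_of(sChars);
    return (last == std::wstring::npos) ? std::wstring() : sBuf.substr(0, last + 1);
}
```

This version avoids creating a one-character temporary wstring for every position tested, which the substr(K, 1) calls do.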

...
« Last Edit: February 17, 2013, 05:18:19 PM by Patrice Terrier »

Offline Frederick J. Harris

Re: Latest String Class
« Reply #14 on: February 18, 2013, 01:06:05 AM »
That’s an amazing project you are working on Charles.  I’m at the point of thinking that writing a compiler is about too much for one man nowadays.  There are a few significant ones, to be sure, such as Pelles C, Bob’s PowerBASIC, etc., but it looks to me like it’s hard to compete with something like Microsoft or all the folks who work on the GNU compilers.  I’m especially amazed that your OxygenBasic allows so many different code styles and syntaxes to be used.  I would think that really hones your parsing skills!

I took Patrice’s comment to heart about my InStr not working backwards, and decided to update that.  Actually, doing that solves another problem for me, which was that I was thinking of adding a couple routines to it that specifically targeted file paths and file name strings with paths, such as a GetPath() method, GetExtension() method, GetFileName() method, etc.  However, by implementing a reverse string search in InStr(), I’d essentially have all that accomplished, with the added benefit of having a routine not limited to just file names and paths.  So I did that, and updated the Strings.cpp and Strings.h files I originally posted.  I also added a String::Allocate(int iSize) method, which resizes an existing string to iSize.  The purpose of that is to create a string of a specific capacity when one knows beforehand how big it might grow to.  I originally had one of those as a constructor, but got rid of it when I decided to eliminate my String::CStr() methods, and in their place create constructors that converted a passed number to a string form.  I got that idea some time ago from one of Kraig Brockschmidt’s COM books where I saw he was doing that with his homemade string class.  So since I was doing that in my constructors when an integer was passed in, I had to create an Allocate method when I wanted that done.
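To show how a reverse search covers the path cases, here is a small sketch written against std::wstring rather than my String class, so anyone can compile it. The three function names are the ones floated above; the bodies are only an illustration of the idea and not code from Strings.cpp:

```cpp
#include <string>

// Everything after the last path separator, or the whole string if none.
std::wstring GetFileName(const std::wstring& path) {
    std::wstring::size_type slash = path.find_last_of(L"\\/");
    return (slash == std::wstring::npos) ? path : path.substr(slash + 1);
}

// Text after the last dot of the file name, or empty if there is no dot.
std::wstring GetExtension(const std::wstring& path) {
    std::wstring name = GetFileName(path);
    std::wstring::size_type dot = name.rfind(L'.');
    return (dot == std::wstring::npos) ? std::wstring() : name.substr(dot + 1);
}

// Everything up to and including the last separator, or empty if none.
std::wstring GetPath(const std::wstring& path) {
    std::wstring::size_type slash = path.find_last_of(L"\\/");
    return (slash == std::wstring::npos) ? std::wstring() : path.substr(0, slash + 1);
}
```

GetExtension() looks for the dot in the file name only, so a dot in a directory name such as plots.v2 is not mistaken for an extension.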

One issue you might run across with the STL based string class in the Std. Lib., Patrice, is that there doesn’t appear to be a ready-made Replace() method anything like PowerBASIC’s powerful Replace.  As you likely know, PowerBASIC’s Replace allows one to make substitutions in a String where the string can grow, which of course might require destruction of the initial string and creation of a new one somewhat bigger.  I never put much work into learning to use the STL string class, so I looked about to find something, and in one of Bruce Eckel’s “Thinking In C++” books I found a procedure he put together.  It looks like so in its wide string version …

Code: [Select]
wstring& ReplaceAll(wstring& context, const wstring& from, const wstring& to)
{
 size_t lookHere=0;
 size_t foundHere;

 while((foundHere=context.find(from,lookHere)) != wstring::npos)
 {
  context.replace(foundHere, from.size(), to);
  lookHere=foundHere+to.size();
 }

 return context;
}

Being as he’s a big programming author and C++ whiz (he used to be on the C++ Standards Committee), I figured it was likely a good function.  It works, but what I’ve found is that its performance is rather dismal.  Here’s a C++ console program that makes C++ jump through some hoops and tests out a lot of string processing functionality.  This was originally John Gleson’s test program for speed …

Code: [Select]
//#define  UNICODE  //not needed for MS VS
//#define  _UNICODE //not needed for MS VS
#include "windows.h"
#include <stdio.h>
#include <string>
#define  NUMBER 2000000
using namespace std;

wstring& ReplaceAll(wstring& context, const wstring& from, const wstring& to)
{
 size_t lookHere=0;
 size_t foundHere;

 while((foundHere=context.find(from,lookHere)) != wstring::npos)
 {
  context.replace(foundHere, from.size(), to);
  lookHere=foundHere+to.size();
 }

 return context;
}

int main(void)
{
 unsigned int t1=0,t2=0;
 int iCtr=0;
 int i=0;
 wstring s2;

 t1=GetTickCount(), t2=t1;
 wprintf(L"Starting....\n");
 wstring s1(NUMBER,L'-');
 t2=GetTickCount()-t2;
 wprintf(L"Finished Creating String With %u Of These - : milliseconds elapsed - %u\n",NUMBER,t2);
 t2=t1;
 for(i=0; i<NUMBER; i++, iCtr++)
 {
     if(iCtr==7)
     {
        s1[i]=L'P';
        iCtr=0;
     }
 }
 t2=GetTickCount()-t2;
 wprintf(L"Finished Inserting 'P's In s1!              :   milliseconds elapsed - %u\n",t2);
 t2=t1;
 ReplaceAll(s1,L"P",L"PU");
 t2=GetTickCount()-t2;
 wprintf(L"Finished Replacing 'P's With PU!            :   milliseconds elapsed - %u\n",t2);
 t2=t1;
 ReplaceAll(s1,L"-",L"8");
 t2=GetTickCount()-t2;
 wprintf(L"Finished Replacing '-'s With 8!             :   milliseconds elapsed - %u\n",t2);
 t2=t1;
 s2.reserve(2400000);
 wprintf(L"Now Going To Create Lines With CrLfs!\n");
 for(int i=0; i<NUMBER; i=i+90)
     s2+=s1.substr(i,90)+L"\r\n";
 t2=GetTickCount()-t2;
 wprintf(L"Finished Creating Lines!                    :   milliseconds elapsed - %u\n",t2);
 s1=s2.substr(s2.length()-4000,4000);
 t1=GetTickCount()-t1;
 wprintf(L"t1 = %u\n",(unsigned)t1);
 MessageBox(NULL,s1.c_str(),L"Here Is Your String John!",MB_OK);
 getchar();

 return 0;
}

What it does is create a buffer of 2,000,000 dashes.  It then overwrites every seventh character with a ‘P’.  Then it replaces every ‘P’ with “PU”, which of course causes the buffer to grow with each replacement.  Then it goes through the buffer and replaces all the remaining dashes with ‘8’s.  Next it inserts a CrLf every 90 characters to break it into lines, which, again, causes the buffer to grow.  Finally, it outputs the last 4000 characters to a MessageBox().  It’s pretty much string torture, and as soon as it starts running your computer’s fan will likely come on, if it’s not already running!  Note that each time printed is cumulative from the start of the program.  These are the numbers I came up with running on an old and slow laptop …

Code: [Select]
/*
Starting....
Finished Creating String With 2000000 Of These - :   milliseconds elapsed -     16
Finished Inserting 'P's In s1!                   :   milliseconds elapsed -     31
Finished Replacing 'P's With PU!                 :   milliseconds elapsed - 662734
Finished Replacing '-'s With 8!                  :   milliseconds elapsed - 662891
Now Going To Create Lines With CrLfs!
Finished Creating Lines!                         :   milliseconds elapsed - 662906
t1 = 662906
*/

As you can see, the bottleneck in the above program was the replacement of Ps with PUs.  That took up like 99.99% of the time.  What amazes me though is it only took 15 ticks to do the concatenation for loop!!!  I find that unimaginable.  Especially if you take a look at that same program with my String class …

Code: [Select]
//#define  UNICODE
//#define  _UNICODE
#include "Windows.h"
#include "tchar.h"
#include <stdio.h>
#include "Strings.h"

enum                                              // Exercise
{                                                 // =======================================
 NUMBER         = 2000000,                        // 1)Create a 2MB string of dashes
 LINE_LENGTH    = 90,                             // 2)Change every 7th dash to a "P"
 RIGHT_BLOCK    = 4000,                           // 3)replace every "P" with a "PU" (hehehe)
 NUM_PS         = NUMBER/7+1,                     // 4)replace every dash with an "8"
 PU_EXT_LENGTH  = NUMBER+NUM_PS,                  // 5)Put in a CrLf every 90 characters
 NUM_FULL_LINES = PU_EXT_LENGTH/LINE_LENGTH,      // 6)Output last 4K to Message Box
 MAX_MEM        = PU_EXT_LENGTH+NUM_FULL_LINES*2
};

int main()
{
 String  CrLf(_T("\r\n"));
 int iCtr=0;
 int i=0;
 String s1,s2;
 DWORD t1,t2;

 t1=GetTickCount(), t2=t1;
 puts("Processing...");
 s1.Allocate(MAX_MEM);
 t1=GetTickCount()-t1;
 printf("After Allocating s1(MAX_MEM)...         : %u\n",(unsigned)t1);
 t1=t2;
 s1.Make('-',NUMBER);
 t1=GetTickCount()-t1;
 printf("After Making 2,000,000 dashes...        : %u\n",(unsigned)t1);
 t1=t2;
 s2.Allocate(MAX_MEM);
 for(i=0; i<NUMBER; i++, iCtr++)
 {
     if(iCtr==7)
     {
        s1.SetTCHAR(i,_T('P'));
        iCtr=0;
     }
 }
 t1=GetTickCount()-t1;
 printf("After Assigning Ps Every 7th Char...    : %u\n",(unsigned)t1);
 t1=t2;
 s2=s1.Replace((TCHAR*)_T("P"),(TCHAR*)_T("PU"));
 t1=GetTickCount()-t1;
 printf("After Doing John's PU Thing...          : %u\n",(unsigned)t1);
 t1=t2;
 s1=s2.Replace((TCHAR*)_T("-"),(TCHAR*)_T("8"));
 t1=GetTickCount()-t1;
 printf("After Replacing Dashes With 8s...       : %u\n",(unsigned)t1);
 t1=t2;
 s2.SetTCHAR(0,_T('\0'));
 for(i=0; i<PU_EXT_LENGTH; i+=LINE_LENGTH)
     s2+=s1.Mid(i+1,LINE_LENGTH)+CrLf;
 t1=GetTickCount()-t1;
 printf("After Big Concatenation With CrLfs...   : %u\n",(unsigned)t1);
 t1=t2;
 s1=s2.Right(RIGHT_BLOCK);
 t1=GetTickCount()-t1;
 printf("Finished!                               : %u\n",(unsigned)t1);
 MessageBox(0,s1.lpStr(),_T("Here's Your String John!"),MB_OK);
 getchar();

 return 0;
}

/*
Processing...
After Allocating s1(MAX_MEM)...         : 0
After Making 2,000,000 dashes...        : 15
After Assigning Ps Every 7th Char...    : 31
After Doing John's PU Thing...          : 265
After Replacing Dashes With 8s...       : 406
After Big Concatenation With CrLfs...   : 45234
Finished!                               : 45234
*/


Mine ran in about 45 seconds, instead of the 11 minutes for the C++ Std. Lib. String class, but those numbers don’t come anywhere close to telling the full story.  Note that the first couple numbers are almost exactly the same. Then when my string class hit the replacement of Ps with PUs, it ran through there in about 230 ticks instead of the 11 minutes in the first program.  But then look what happened to my string class when it hit the for loop concatenation with CrLfs!  It took my string class almost 45000 ticks to do what the Std. Lib. String class did in 15 ticks!!! 

I’ve tried everything I know how to do to get my concatenation numbers down, short of buying a pile of STL books and taking months to learn it, but to no avail.  I don’t know how it’s doing that in 15 ticks.  The tentative conclusion I’ve come to is as follows.

It’s not really doing the algorithm!

I suspect that the compiler is somehow optimizing the string handling through the class interface completely away, and simply doing low level byte blasting with either the asm byte blitting string primitives, or their close counterparts in the C Std. Lib., i.e., strcpy(), memcpy(), etc.

Using those myself I can easily blow away the kind of numbers I’m showing above in either program, and get that algorithm to run nearly instantaneously in just a few ticks.  But it’s pretty ugly, mean code too!
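A less exotic possibility, offered only as a guess: the Std. Lib. program calls s2.reserve(2400000) before the loop, and a wstring keeps track of its own length, so each += can be a plain copy onto the end of a buffer that never has to move.  The sketch below checks that on whatever compiler you have.  The function name is made up, and the "buffer does not move" behaviour is what the mainstream library implementations do in practice rather than something I can quote chapter and verse on:

```cpp
#include <cstddef>
#include <string>

// Appends `lines` chunks of 90 characters plus CrLf to a string whose
// capacity was reserved up front, and reports whether the buffer stayed
// at the same address and ended up with the expected length.
bool AppendsWithoutReallocating(std::size_t lines) {
    const std::wstring chunk(90, L'8');
    std::wstring out;
    out.reserve(lines * 92);
    const wchar_t* start = out.data();
    for (std::size_t i = 0; i < lines; ++i) {
        out += chunk;       // copies 90 characters onto the end
        out += L"\r\n";     // copies 2 more
    }
    return out.data() == start && out.size() == lines * 92;
}
```

The loop in the test program makes about 22,000 such appends of 92 characters each, roughly 2 million characters in total, which is not much work for a copy routine.  If a string class instead rescans or recopies the whole buffer on every +=, the cost grows with the square of the buffer size, which would be in line with the 45 seconds above.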

It’s further my belief that those poor numbers from the C++ Std. Lib’s string class aren’t specifically an inherent problem with the class, but rather with that poor ReplaceAll() function of Bruce Eckel’s.  So if you are in the mode, Patrice, of using the C++ Std. Lib’s string class and coming up with your own versions of basic string members for it, maybe you’ll want to come up with a better version than Bruce’s shown above, either one you write yourself, or one you get elsewhere.  If you want to see a real ‘hot’ C++ coder’s (not me) implementation of that algorithm I can post it for you, but it isn’t much prettier than my C versions of it, and not quite as fast either.  I just thought you might be interested in this stuff, as you are now working with it, and it’s something that I put a lot of effort into myself in the past as I was becoming more conversant with C++.
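The reason ReplaceAll() bogs down when the replacement is longer than the match is that every context.replace() call has to shift the rest of the buffer, so hundreds of thousands of replacements in a 2,000,000 character string each move a large tail.  One way around that, sketched here under the assumption that returning a new string is acceptable, is to build the result in a second buffer in a single pass.  The name ReplaceAllCopy is made up, and no timings were taken with it:

```cpp
#include <string>

// Single-pass replace: the result is built in a second buffer, so no
// tail of the source string is shifted for each match.
std::wstring ReplaceAllCopy(const std::wstring& context,
                            const std::wstring& from,
                            const std::wstring& to) {
    if (from.empty()) return context;           // nothing sensible to search for
    std::wstring result;
    result.reserve(context.size());             // grows further only if needed
    std::wstring::size_type lookHere = 0, foundHere;
    while ((foundHere = context.find(from, lookHere)) != std::wstring::npos) {
        result.append(context, lookHere, foundHere - lookHere);  // text before the match
        result += to;
        lookHere = foundHere + from.size();
    }
    result.append(context, lookHere, std::wstring::npos);        // remaining tail
    return result;
}
```

In the test program the call would become s1 = ReplaceAllCopy(s1, L"P", L"PU").  The guard on an empty `from` also avoids the endless loop the original would fall into in that case.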