Author Topic: Minimize Program Size By Eliminating The C Runtime Library  (Read 19946 times)

Offline Frederick J. Harris

  • Hero Member
  • *****
  • Posts: 914
  • User-Rate: +16/-0
    • Frederick J. Harris
Using TCLib.lib And String Class For x86/x64/ascii/unicode Application Development

     TCLib is a replacement for the Microsoft C Runtime Library whose purpose is to allow the creation of applications consisting of very small executable files.  Console mode or GUI programs can be created with executable sizes starting in the 2 or 3 kilobyte range.  This feat is achieved by not linking against the C Runtime Library, statically or dynamically.  This can be done because the modular design of the C and C++ languages separates core language features from input/output activities and Runtime Library support.

     My String Class consisting of two files named Strings.h and Strings.cpp is a replacement for the C++ Standard Library String Class, but it also contains some of the functionality of the C++ iostream Class.  It is desirable to use in conjunction with the code I’ll provide to eliminate the C Runtime because the manipulation of C style character arrays is a very awkward way of dealing with strings in the context of high level application development work.  My String Class works with TCLib.lib but the C++ Standard Library String Class, iostreams, and in fact everything in the C++ Standard Library won’t.  It’ll be off limits to you so I want to point that out up front.  If achieving small non-bloated programs is important to you, the above C++ Libraries must be avoided at all costs, as it is the design and construction of those libraries which causes inflated binaries which we term as ‘bloatware’.   

     First off, if you don’t have TCLib.lib and want to create it yourself from scratch, or you just want to know how to do it, you’ll need these *.cpp and *.h files which I’ve provided in the download and will also post in their entirety later…

Code: [Select]
crt_con_a.cpp       Process entry point for ASCII console programs
crt_con_w.cpp       Process entry point for wide console programs
crt_win_a.cpp       Process entry point for ASCII GUI programs
crt_win_w.cpp       Process entry point for wide GUI programs
memset.cpp          Replacement for C Runtime memset()
memcpy.cpp          Replacement for C Runtime memcpy()
newdel.cpp          Implementation of C++ new and delete operators
printf.cpp          Replacement for C Runtime printf() – Thanks Matt Pietrek!
sprintf.cpp         Replacement for C Runtime sprintf() – Thanks Matt Pietrek!
_strnicmp.cpp       My Replacement for C Runtime _strnicmp
strncpy.cpp         My Replacement for C Runtime strncpy
strncmp.cpp         My Replacement for C Runtime strncmp
_strrev.cpp         My Replacement for C Runtime _strrev
strcat.cpp          My Replacement for C Runtime strcat
strcmp.cpp          My Replacement for C Runtime strcmp
strcpy.cpp          My Replacement for C Runtime strcpy
strlen.cpp          My Replacement for C Runtime strlen
getchar.cpp         My Replacement for C Runtime getchar
alloc.cpp           Replacement for C Runtime memory allocation functions from Matt Pietrek
alloc2.cpp          Replacement for C Runtime memory allocation functions from Matt Pietrek
allocsup.cpp        Replacement for C Runtime memory allocation functions from Matt Pietrek
FltToCh.cpp         My replacement for floating point support only available in C Runtime
atol.cpp            My Replacement for C Runtime atol
_atoi64.cpp         My Replacement for C Runtime _atoi64
abs.cpp             My Replacement for C Runtime abs
win32_crt_math.cpp  (only used in x86) Big Thanks To Martins Mozeiko From Handmade Hero For This!

malloc.h
memory.h
stdio.h
stdlib.h
string.h
tchar.h

Running TCLib.mak (just below) with Microsoft’s nmake.exe utility will create the library.  Note this line towards the bottom, which gives lib.exe its directions for creating the lib…

Code: [Select]
LIB /NODEFAULTLIB /machine:x64 /OUT:$(PROJ).LIB $(OBJS)

For creating a 32 bit version of TCLib.lib, the /machine: specification should be x86, not x64.  I haven’t pointed out yet that this code works for both x86 and x64.  If you want both, which you possibly will, you need to create a separate directory for each and build and use two separate libs.  The makefile below is for x64, as you can see from the /machine:x64 specification….

// TCLib.mak
Code: [Select]
PROJ       = TCLib

OBJS       = crt_con_a.obj crt_con_w.obj crt_win_a.obj crt_win_w.obj memset.obj newdel.obj printf.obj \
             sprintf.obj _strnicmp.obj strncpy.obj strncmp.obj _strrev.obj strcat.obj strcmp.obj \
             strcpy.obj strlen.obj getchar.obj alloc.obj alloc2.obj allocsup.obj FltToCh.obj atol.obj \
             atoi64.obj abs.obj memcpy.obj win32_crt_math.obj
       
CC         = CL
CC_OPTIONS = /D "_CRT_SECURE_NO_WARNINGS" /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN

$(PROJ).LIB: $(OBJS)
    LIB /NODEFAULTLIB /machine:x64 /OUT:$(PROJ).LIB $(OBJS)

.CPP.OBJ:
    $(CC) $(CC_OPTIONS) $<

If you aren’t familiar with doing these sorts of things, simply create a directory/folder somewhere, open a command prompt to that directory, put all of the above files (the *.cpp and *.h files) in that directory including TCLib.mak, and execute nmake on TCLib.mak as follows…

C:\SomeDirectory>nmake /f TCLib.mak [ENTER]

Note that the path to the Microsoft Compiler Toolchain must be properly set for the operating system to find nmake from any arbitrary folder on your computer.  For Microsoft products (which is all this code and work applies to, really), when you download the SDK or install Visual Studio, Microsoft puts shortcuts on your Start Menu to batch files which when executed set the paths correctly to the compiler toolchain.  On my computer with Visual Studio 2015 this is the target and command line switch of the shortcut I’m using for x64 compiles….

""C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat"" amd64

The ‘amd64’ command line argument to vcvarsall.bat tells this batch file to set the compiler to run in x64 mode.

     Having successfully created TCLib.lib as above, or just using the supplied one, let’s start with a simple Dennis Ritchie Hello, World! program.  Here is Demo1.cpp…

Code: [Select]
// cl Demo1.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 2,048 bytes x86 ASCII, 2,048 bytes x86 UNICODE VC15
// 3,072 bytes x64 ASCII, 3,072 bytes x64 UNICODE VC19
#define  UNICODE
#define  _UNICODE
#include <windows.h>
#include "stdio.h"
#include "tchar.h"

int _tmain()
{
 _tprintf(_T("Hello, World!\n"));
 getchar();

 return 0;
}

// Output:
// =============
// Hello, World!

     That compiles to various numbers between 2k and 3k depending on which version of the Microsoft compiler you are using, whether you are doing an ascii or unicode build, or whether you are doing x86 or x64.  It compares very favorably with even asm code as below which compiles to just 2.5 k…

Code: [Select]
; C:\Code\MASM\Projects\Demo1>ml /c /coff /Cp hello2.asm
; C:\Code\MASM\Projects\Demo1>link /SUBSYSTEM:CONSOLE hello2.obj
; 2,560 Bytes
.386
.model flat,stdcall
option casemap:none

include     \masm32\include\windows.inc
include     \masm32\include\kernel32.inc
include     \masm32\include\msvcrt.inc

includelib  \masm32\lib\kernel32.lib
includelib  \masm32\lib\msvcrt.lib

.data
msg         db 'Hello, World!',13,10,0

.code
start:
  invoke  crt_printf, ADDR msg
  invoke  crt_getchar
  invoke  ExitProcess, 0
END start
 

…although to be fair, I have to admit I am using the C Runtime printf for console output above in the asm code.  Getting back to Demo1.cpp, bear with me while I discuss that program in some detail.  This line…

Code: [Select]
cl Demo1.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib   

…invokes the Microsoft compiler cl.exe.  The ‘/O1’ and ‘/Os’ switches tell the compiler to optimize for small code size.  No debugging information ends up in the binary either, simply because no debug switch such as ‘/Zi’ is given.  The ‘/GS-’ switch turns off buffer security checks.  This must be done because some of the code implementing those checks lives in the C Runtime Library, which isn’t present to be linked against.  If the switch isn’t specified the code won’t link.

     The ‘/link TCLib.lib kernel32.lib’ part tells the linker which libraries to look in first when resolving function or variable references found in the code.  Every such reference must be resolved by one of those two libraries.  If one isn’t, the linker falls back on the default C Runtime Library and starts looking there.  When it does that it will encounter naming collisions with the substitute functions I have provided in TCLib.lib, and it will not know which code to use.  At that point linker errors will be generated, the executable won’t be produced, and you’ll be looking at bizarre linker errors the likes of which you’ve never seen before and you’ll be SOL and very unhappy.  That’s why you can’t use anything from the C++ Standard Library such as its string classes, std::string or std::wstring.

     Note we need to include windows.h in all these programs because a substantial amount of the support which would ordinarily come from the C Runtime has to come from somewhere else if we eliminate it, and that somewhere else is in many cases the Windows libraries, which require us to include windows.h.  These lines…

Code: [Select]
#define UNICODE
#define _UNICODE

…tell the preprocessor we’re doing a wide character build.  That will greatly affect how these files are processed…

Code: [Select]
#include <windows.h>
#include "stdio.h"
#include "tchar.h"

     Note the top file uses ‘<>’ symbols to enclose a file name, and the bottom two file names are enclosed in quotes.  A small detail, seemingly, but this is important!  The angle brackets tell the preprocessor to look for the file in the system include directories, i.e., those listed in the INCLUDE environment variable set up by the toolchain batch file.  The enclosing quotes, on the other hand, tell the preprocessor to look first in the directory of the source file doing the including, and only then in the system directories.  Stdio.h and tchar.h are normally system include files, but I had to make up TCLib specific renditions of these files, which is unusual I admit, because various things encountered within the system files of the same name would cause the compiler or linker to throw all kinds of bizarre warnings and errors that will ruin your life trying to track down and fix.  Believe me, I know.  I have supplied my versions of those files for you, and you’ll see they are much simpler and smaller than the system files of the same name.  Bottom line: be very careful about whether you include files with ‘< >’ symbols or double quotes.

     Moving on, this symbol…

Code: [Select]
int _tmain()

…will resolve through the mysterious alchemy of tchar.h to…

Code: [Select]
wmain

…if UNICODE is defined (which it is), or to…

Code: [Select]
main

…if it isn’t.  Let’s take a look at what the above program compiles to size-wise if we use the C Runtime.  That will give us some idea of what we’re saving by eliminating it.  Here would be that…

Code: [Select]
// cl Hello.cpp /O1 /Os /MT    // Optimize for small code and stand alone build with no redistributables needed.
// 119,296 bytes x64 ASCII     // Things have gotten so bad we need 120 k for Hello, World!!!!!!
#include <stdio.h>             // We can and should use ‘< >’ symbols on stdio.h because we’re using the system file.

int main()
{
 printf("Hello, World!\n");
 getchar();

 return 0;
}

// Output:
// =============
// Hello, World!

     So we’re saving something like one hundred and sixteen to one hundred and seventeen thousand bytes if using Microsoft’s VC19 compiler.  The situation is the same with GUIs.  Here’s Form1.cpp

Code: [Select]
// cl Form1.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib user32.lib
#define UNICODE        //  3,072 Bytes x64 UNICODE or ASCII With TCLib.lib
#define _UNICODE       // 84,992 Bytes With C Standard Library Loaded (LIBCMT.LIB)
#include <windows.h>
#include "tchar.h"

LRESULT CALLBACK fnWndProc(HWND hwnd, unsigned int msg, WPARAM wParam, LPARAM lParam)
{
 if(msg==WM_DESTROY)
 {
    PostQuitMessage(0);
    return 0;
 }

 return (DefWindowProc(hwnd, msg, wParam, lParam));
}

int WINAPI _tWinMain(HINSTANCE hInstance, HINSTANCE hPrevIns, LPTSTR lpszArgument, int iShow)
{
 WNDCLASSEX wc={0};
 MSG messages;
 HWND hWnd;

 wc.lpszClassName = _T("Form1");
 wc.lpfnWndProc   = fnWndProc;
 wc.cbSize        = sizeof(WNDCLASSEX);
 wc.hInstance     = hInstance;
 wc.hbrBackground = (HBRUSH)COLOR_BTNSHADOW;
 RegisterClassEx(&wc);
 hWnd=CreateWindowEx(0,_T("Form1"),_T("Form1"),WS_OVERLAPPEDWINDOW|WS_VISIBLE,200,100,325,300,HWND_DESKTOP,0,hInstance,0);
 while(GetMessage(&messages,NULL,0,0))
 {
    TranslateMessage(&messages);
    DispatchMessage(&messages);
 }

 return (int)messages.wParam;
}

We’re at only 3 k with that in x64 wide character.  I believe with the older VC15 from Visual Studio 2008 running in x86 it’s 2,560 bytes.  The situation with pure asm is similar.  Here is the above in 32 bit masm, which also compiles to 2,560 bytes…

Code: [Select]
.386
.model     flat, stdcall
option     casemap:none
include    \masm32\include\windows.inc
include    \masm32\include\user32.inc
include    \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
WinMain proto hInst:HINSTANCE, hPrevInst:HINSTANCE, CmdLine:LPSTR, CmdShow:DWORD

.Data
ClassName   db "Form2", 0
AppName     db "Form2", 0

.Data?
hInstance   HINSTANCE ?
CommandLine LPSTR     ?

.Code
start:

invoke GetModuleHandle, NULL
mov    hInstance, eax
invoke GetCommandLine
mov    CommandLine, eax
invoke WinMain, hInstance, NULL, CommandLine, SW_SHOWDEFAULT
invoke ExitProcess, eax


WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
  .IF uMsg==WM_DESTROY
      invoke PostQuitMessage, NULL
  .ELSE
      invoke DefWindowProc, hWnd, uMsg, wParam, lParam
      ret
  .ENDIF
  xor eax, eax
  ret
WndProc endp


WinMain proc hInst:HINSTANCE, hPrevInst:HINSTANCE, CmdLine:LPSTR, CmdShow:DWORD
  LOCAL wc  : WNDCLASSEX
  LOCAL msg : MSG
  LOCAL hwnd: HWND
 
  mov    wc.cbSize, SIZEOF WNDCLASSEX
  mov    wc.style, CS_HREDRAW or CS_VREDRAW
  mov    wc.lpfnWndProc, OFFSET WndProc
  mov    wc.cbClsExtra, NULL
  mov    wc.cbWndExtra, NULL
  push   hInstance
  pop    wc.hInstance
  mov    wc.hbrBackground, COLOR_WINDOW + 1
  mov    wc.lpszMenuName, NULL
  mov    wc.lpszClassName, OFFSET ClassName
  invoke LoadIcon,NULL, IDI_APPLICATION
  mov    wc.hIcon, eax
  mov    wc.hIconSm, eax
  invoke LoadCursor, NULL, IDC_ARROW
  mov    wc.hCursor, eax
  invoke RegisterClassEx, addr wc
  INVOKE CreateWindowEx,NULL,ADDR ClassName,ADDR AppName,WS_OVERLAPPEDWINDOW,100,100,375,350,NULL,NULL,hInst,NULL
  mov    hwnd, eax
  invoke ShowWindow, hwnd, SW_SHOWNORMAL
  invoke UpdateWindow, hwnd
  .WHILE TRUE
      invoke GetMessage, ADDR msg, NULL, 0, 0
      .BREAK .IF (!eax)
      invoke TranslateMessage, ADDR msg
      invoke DispatchMessage, ADDR msg
  .ENDW
  mov    eax, msg.wParam
  ret
WinMain endp
end start

     So the inevitable conclusion at which one must arrive is that it isn’t the C++ language which is bloated or produces bloatware; it’s the libraries you use which produce that effect.  Let’s move on to Demo2.cpp which actually does something a little hard because we’re using floating point math.  That has been the most difficult part by far of developing this library, because all the support for manipulating floating point numbers is ultimately in the C Standard Library, which we’ve eliminated.

continued...

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #1 on: March 23, 2016, 01:19:44 AM »
Code: [Select]
// cl Demo2.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 4,096 bytes x86 ASCII, 4,096 bytes x86 UNICODE  VC15
// 5,120 bytes x64 ASCII, 5,632 bytes x64 UNICODE  VC19
#define UNICODE
#define _UNICODE
#include  <windows.h>
#include  "stdio.h"
#include  "tchar.h"
#define   x64
extern "C" int _fltused=1;

int _tmain()
{
 double dblNums[]={123456.78912, -456.9876, 9999.99999, -0.987654, 0.0, 1.985};
 TCHAR szBuffer[16];
 int iLenRet=0;

 _tprintf(_T("i       iLenRet   dblNums[i]\n"));
 _tprintf(_T("============================\n"));
 for(size_t i=0; i<6; i++)
 {
     iLenRet=FltToTch(szBuffer, dblNums[i], 16, 2, _T('.'),true);
     #ifdef x64
        _tprintf(_T("%-4Iu%8d%16s\n"),i,iLenRet,szBuffer);
     #else
        _tprintf(_T("%-4u%8d%16s\n"),i,iLenRet,szBuffer);
     #endif
 }
 getchar();

 return 0;
}


#if 0

i       iLenRet   dblNums[i]
============================
0         15       123456.79
1         15         -456.99
2         15        10000.00
3         15           -0.99
4         15            0.00
5         15            1.98

#endif

     Almost all of the past two months working on this project has been involved in one way or another with solving the floating point math problem.  As you can see above I have an array of six doubles of rather widely differing sizes.  I try to print them out to the console with right justified formatting and rounded to two decimal places.  I had to code that all myself because Windows relies on functionality in the C Standard Library to deal with floating point math.  Very surprisingly, the problem in doing this wasn’t x64.  All the problem was in x86 32 bit.  You’ll see a function above which I named FltToTch.  That’s actually a macro defined in my version of tchar.h for FltToCh and FltToWch.  Here are the function declarations.  Depending on whether UNICODE is defined or not, FltToTch resolves to either…

Code: [Select]
size_t __cdecl FltToCh(char* p, double x, size_t iFldWthTChrs, int iDecPlaces, char cDecimalSeperator, bool blnRightJustified);

…or this…

Code: [Select]
size_t __cdecl FltToWch(wchar_t* p, double x, size_t iFldWthTChrs, int iDecPlaces, wchar_t cDecimalSeperator, bool blnRightJustified);

     You’ll only need to use this function if you want to output floating point numbers.  For integral numbers, i.e., ints, longs, _int64s, or strings, or anything else, you can just use printf or sprintf and their tchar varieties.  They work as expected.  If doubles need to be output through use of the above function, you’ll need to provide a buffer where the output of the conversion is to be written.  In that sense, the function is a bit like sprintf.  The first parameter is that buffer…

Code: [Select]
size_t __cdecl FltToCh
(
  char* p,                   // Pointer to buffer of a size large enough to accommodate converted double
  double x,                  // The double to be converted
  size_t iFldWthTChrs,       // Should be the total size of 1st parameter buffer in TCHARs including NULL
  int iDecPlaces,            // Number of decimal places in desired output
  char cDecimalSeperator,    // Character used to separate whole number from decimal (‘.’ for U.S.)
  bool blnRightJustified     // Right/Left Justification In Buffer
);
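
To make the buffer size and justification parameters a little more concrete, here is a rough sketch of the padding step only.  It is not the FltToCh source.  The helper name PadIntoField and its exact behaviour are assumptions made purely for illustration, chosen so that a 16 character buffer yields 15 characters plus the terminator, as in the Demo2 output above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of TCLib): copy 'text' into 'buf' so that the
// result occupies exactly bufChars-1 characters plus the terminating NUL,
// padded with blanks on the left (right justified) or on the right.
// Returns the number of characters written, or 0 if 'text' does not fit.
std::size_t PadIntoField(char* buf, std::size_t bufChars, const char* text, bool rightJustified)
{
    assert(buf != nullptr && text != nullptr);
    std::size_t len = std::strlen(text);
    if (bufChars == 0 || len > bufChars - 1)
        return 0;                                  // no room for text plus NUL

    std::size_t width = bufChars - 1;              // usable characters in the field
    std::size_t start = rightJustified ? width - len : 0;
    std::memset(buf, ' ', width);                  // fill the whole field with blanks
    std::memcpy(buf + start, text, len);           // drop the text at the chosen end
    buf[width] = '\0';
    return width;
}
```

With a 16 character buffer and the text "1.98", this returns 15 and leaves the digits in the last four positions of the field.  The real FltToCh of course also performs the numeric conversion and rounding before anything like this step happens.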

     This function can be used without my String Class as is the case above in Demo2.cpp.  It is also used within my String Class in the various formatting members.  This function’s capabilities are what was missing from my previous posts on removing the C Runtime.  Through use of this function in x86 or x64 we can finally free ourselves from the C Runtime – at least in these demo programs.  As you can see in the program sizes we’re up to 4 or 5 k or so.  Let’s take a look at that program using the C Runtime in the typical manner…

Code: [Select]
// cl Demo2.cpp /O1 /Os /MT
// 119,808 Bytes x64 ASCII VC19
#include  <stdio.h>
#define x64

int main()
{
 double dblNums[]={123456.78912, -456.9876, 9999.99999, -0.987654, 0.0, 1.985};
 
 printf("i     dblNums[i]\n");
 printf("================\n");
 for(size_t i=0; i<6; i++)
 {
     #ifdef x64
        printf("%-4Iu%12.2f\n",i,dblNums[i]);
     #else
        printf("%-4u%12.2f\n",i,dblNums[i]);
     #endif
 }
 getchar();

 return 0;
}

#if 0

i     dblNums[i]
================
0      123456.79
1        -456.99
2       10000.00
3          -0.99
4           0.00
5           1.99

#endif

     So instead of 4 or 5 k we’re at 120 k.  Something worth noting there that’s maybe a mistake on my part is what’s happening with dblNums[5], which is the 1.985 entry in the array.  I did that specifically to stress test my FltToTch algorithm.  I thought that in rounding numbers it was standard practice to round to the nearest even digit when the digit to be dropped is right in the middle, i.e., a ‘5’ digit.  So 1.985 rounded to two decimal places would cause the ‘5’ to be simply dropped, yielding 1.98.  Likewise, I would have thought 1.975 would be rounded up instead of down because 1.98 ends in the nearest even digit.  At least, that’s how I coded my rounding algorithm in FltToCh.  I can easily change it if someone tells me it’s wrong.  As you can see, the C Runtime rounded to 1.99 instead of my 1.98.  All the other numbers seem to agree with my function, I believe.
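
The round-half-to-even rule described above is easiest to see on integers, where binary representation issues don’t get in the way.  The sketch below is not the FltToCh code.  RoundHalfEvenDropDigit is a hypothetical helper that drops the last decimal digit of a scaled integer, with 1985 standing in for 1.985.  Keep in mind too that 1.985 cannot be stored exactly in a double, so the stored value sits slightly off the exact tie, which is one plausible reason a runtime library and a digit by digit routine can disagree on such inputs.

```cpp
#include <cassert>

// Hypothetical helper: drop the last decimal digit of v, rounding half to even.
// Example: 1985 (meaning 1.985) becomes 198 (meaning 1.98).
long long RoundHalfEvenDropDigit(long long v)
{
    long long q = v / 10;                 // truncates toward zero
    long long r = v % 10;                 // remainder carries the sign of v
    long long a = (r < 0) ? -r : r;       // magnitude of the dropped digit

    // Round away from zero when above the midpoint, or exactly on the
    // midpoint with an odd digit in front of it.
    if (a > 5 || (a == 5 && (q % 2 != 0)))
        q += (v < 0) ? -1 : 1;
    return q;
}

// The cases discussed in the text, written as checks.
void RoundHalfEvenSelfCheck()
{
    assert(RoundHalfEvenDropDigit(1985) == 198);   // tie, 8 is even: stays
    assert(RoundHalfEvenDropDigit(1975) == 198);   // tie, 7 is odd: goes up
    assert(RoundHalfEvenDropDigit(1986) == 199);   // above the midpoint
}
```

Under this rule both 1.985 and 1.975 land on 1.98.  A routine that instead rounds every tie away from zero would give 1.99 and 1.98 respectively.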

     The next thing I have to point out about Demo2.cpp is this line…

Code: [Select]
extern "C" int _fltused=1;

     That’s another doozy.  Lost a good many days on that one, and was finally pulled out of the depths of complete and utter defeat by Martins Mozeiko from the Handmade Hero gaming forum.  He’s another one of these incredible folks like Jose Roca here or Michael Mattias in the PowerBASIC Forums who seem to know everything there is to know.  It was Martins’ article on removing the C Runtime where I learned about _fltused.  If you attempt to so much as assign a floating point number to a double without the C Standard Library being loaded your screen will explode in more truly evil looking linker errors than you could ever imagine.  Perhaps at this point I might mention some other issues concerning floating point support since, like I said, it was nearly a ‘stopper’ on this project; it was that difficult to deal with.

     To begin with, when I first started working on this project through Matt Pietrek’s original LibCTiny work which dates back all the way to the late 1990s, I wasn’t aware of the floating point issue as Matt never mentioned it.  He simply stated that his work might be useful in some small utility programs, but other more complex projects would likely need full use of the C and C++ Standard Libraries.  But he did have nice implementations of printf and sprintf, which are really key functions in the C/C++ coding universe, and I assumed they worked for all variable types – including floating point.  It wasn’t until I was well underway in this project with high hopes of seeing it to completion where I would have full support for both ascii and wide character in both x86 and x64 architectures when I discovered to my complete amazement and dismay that floating points wouldn’t work.  That’s where the _fltused thing enters the picture.  It appears to be some flag sent perhaps between the compiler and linker telling the linker that floating point routines are going to be needed.

     But inserting …

Code: [Select]
extern "C" int _fltused=1;

… into the source code didn’t solve the floating point issue.  All that did was enable a program to compile/link if a double was declared and initialized, like so…

Code: [Select]
double dblNumber = 3.14159;

It didn’t allow printf family functions to convert floating points to ascii/wide strings.  That’s where Raymond Filiatreault’s assembler based FpuLib.lib entered the picture.  But that solution only worked for x86 because Raymond didn’t provide an x64 version of the library.  Actually, I believe Raymond’s work on that was done long before x64 even entered the picture.  So in an effort to achieve a solution that would work for both x86 and x64 I coded my own conversion routines, which are the FltToCh and FltToWch procedures we used above.  I started that coding work in an x64 environment simply because that was the functionality I lacked.  Since I was writing basic C code it never occurred to me it wouldn’t compile in x86.  I figured when I finished and got it working in x64 I’d simply recompile in x86 and all would be peaches and cream and I’d have a common solution for both 32 bit and 64 bit.  Well, that’s not how it worked out!  Not at all!

     Just the other day I ran into an article I hadn’t known about earlier in Dr. Dobb’s Journal by Matthew Wilson, February 1, 2003 entitled “Avoiding The Visual C++ Runtime Library”…

http://www.drdobbs.com/avoiding-the-visual-c-runtime-library/184416623

From him I learned that what I had just accomplished was all but impossible…     

Quote
64-Bit Integers and Floating Points

If you are using floating points in all their glory, then there is no choice but to use the CRT Library, because the complex supporting infrastructure functions and variables reside within it. However, many uses of floating-point numbers are in the fractional intermediate calculations of integral numbers. In these circumstances, it is often possible to emulate the calculation by some clever use of the Win32 MulDiv() function, which multiplies two 32-bit parameters into a 64-bit intermediate and then divides that by the third 32-bit parameter....
...
...
…The 64-bit integer type, __int64, has been a built-in type in the Visual C++ compiler since Version 2.0. Simple operations on the type, including bitwise and logical (Boolean) operations, and addition and subtraction can induce the compiler to place inline bit/byte-wise manipulation. However, the arithmetic operations multiply, divide, and modulo, and the shift operators are all implemented as calls to CRT Library functions (_allmul(), _alldiv(), _allrem(), _allshl(), and _allshr()). If you are using any of those operations, and cannot convert those operations to their 32-bit equivalents without losing accuracy, then you must accept linking to the CRT Library.       

     When I attempted to compile my x64 coded and working FltToCh / FltToWch functions in x86 I immediately got bombarded with more compilation and linker errors having to do with functions I had never heard of, specifically _dtoui.  That stands for double to unsigned int.  It was precipitated by a casting operation in FltToCh, which involves casting a 64 bit double to a 32 bit unsigned int.  Here’s what it looks like.  The code below strips off the leading digits of a double precision number one at a time and indexes into the ascii table, where code 48 is the character ‘0’.  The expression n = (size_t)x, where n is a size_t and x is a double, is what is doing it…

Code: [Select]
while(k<=17)
{
   if(k == i)
      k++;
   *(p1+k)=48+(char)n;
   x=x*10;
   n = (size_t)x;                //   <<<<<<<<<<<<<<<<<<<<< BANG!
   x = x-n;
   k++;
 }
 *(p1+k) = '\0';
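
For anyone who wants to experiment with the digit stripping idea outside of FltToCh, here is a minimal self-contained sketch of the same multiply by ten and truncate loop, applied to the fractional part of a number only.  The function name FractionDigits and its signature are assumptions for illustration.  The real routine also handles the sign, the whole number part, the decimal separator and rounding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: write 'count' decimal digits of the fractional part
// of x (0 <= x < 1) into buf, followed by a terminating NUL.
// buf must have room for count + 1 characters.
void FractionDigits(double x, char* buf, std::size_t count)
{
    assert(buf != nullptr && x >= 0.0 && x < 1.0);
    for (std::size_t k = 0; k < count; ++k)
    {
        x = x * 10;                        // move the next digit left of the point
        std::size_t n = (std::size_t)x;    // truncate: the double to integer cast
        buf[k] = (char)('0' + n);          // '0' is code 48 in the ascii table
        x = x - n;                         // keep only the remaining fraction
    }
    buf[count] = '\0';
}
```

Since 0.625 is exactly representable in binary, FractionDigits(0.625, buf, 3) leaves "625" in buf.  For most other inputs the trailing digits reflect the stored binary value rather than the decimal literal you typed.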


It just never occurred to me that that was something that wouldn’t compile in x86 but would in x64!  When we are used to working above the assembler level in higher level languages it becomes easy to take things for granted and proceed onward oblivious of what is taking place in the registers.  And that wasn’t the only problem.  The next program, Demo3.cpp, which we haven’t gotten to yet, goes the other way.  In that program I’m converting this…

Code: [Select]
TCHAR szBuffer[]=_T("  -12345678987654");

… to an _int64.  That caused a call to the CRT function _allmul(), one of the functions in the CRT that Matthew Wilson described above.  Right at that point I came as close as I’ve ever come to abandoning this whole project.  But then I thought of all the work I put into this, all the successes I had, and all the unbelievable hurdles I had overcome so far, and decided to plow forward.  After all, I did now have an x64 solution to the floating point issue which I had just coded and which worked well, and I had Raymond’s FpuLib solution in x86.  So I could hack my way through with that.  However, I did do an internet search on these issues and lo and behold where does that end up?  Back at Martins Mozeiko’s article is where.  At that point I decided it would behoove me to join that community and see if I could get to know Martins Mozeiko, notwithstanding the fact that I’m far too much of a puritan to have ever played a computer game.  After telling him of my woes he kindly provided me with several options for getting through the _dtoui issue in FltToCh, one of which I was finally able to get to work (see the conditional compilation directives in FltToCh).  The other issue was the CRT functions called by various operations on 64 bit integers, such as the _allmul(), _alldiv(), _allshl(), etc., functions mentioned by Matthew Wilson above.  In Visual Studio 2015 these can be found here in asm form…

C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\crt\src\i386

In earlier versions of Visual Studio such as my VS 2008 version they are implemented here…

C:\Program Files\Microsoft Visual Studio 9.0\VC\crt\src\intel

Anyway, Martins provided me with his win32_crt_math.cpp file, which you might want to take some note of.  He converted/transferred some of the above assembler into C++ naked functions to take the place of the missing ones in the CRT which we haven’t loaded.  And it’s part of the code running to make this all work in x86.  Just thought you ought to know.
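
To give a feel for what a helper like _allmul() has to do, the sketch below multiplies two 64 bit values using only 32 bit pieces.  This is portable C++ written for illustration.  It is not the code in win32_crt_math.cpp, which uses naked functions, and the name Mul64Sketch is made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: 64 bit multiply built from 32 bit halves.
// The result is the low 64 bits of a * b, the same as the built-in operator.
std::uint64_t Mul64Sketch(std::uint64_t a, std::uint64_t b)
{
    std::uint32_t aLo = (std::uint32_t)a;
    std::uint32_t aHi = (std::uint32_t)(a >> 32);
    std::uint32_t bLo = (std::uint32_t)b;
    std::uint32_t bHi = (std::uint32_t)(b >> 32);

    // 32 x 32 -> 64 bit product of the two low halves.
    std::uint64_t low = (std::uint64_t)aLo * bLo;

    // The cross terms only affect the upper 32 bits of the result, so their
    // sum can be allowed to wrap at 32 bits.  aHi * bHi would land entirely
    // above bit 63 and is dropped.
    std::uint32_t cross = aHi * bLo + aLo * bHi;

    return low + ((std::uint64_t)cross << 32);
}

// Compare against the compiler's own 64 bit multiply.
void Mul64SelfCheck()
{
    std::uint64_t a = 12345678987654ULL, b = 10ULL;
    assert(Mul64Sketch(a, b) == a * b);
}
```

In an x64 build the compiler simply emits one multiply instruction for this.  It is only in x86 that the work has to be split up along these lines, which is why the helper functions are needed there.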

     Let’s move on to Demo3.cpp (and maybe now you’ll have some appreciation of what’s going on behind the scenes to make it work), which shows going the other way, i.e., starting with a numeric character string in ascii or wide format and converting it to binary…

Code: [Select]
// cl Demo3.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 3,584 bytes;  34.57 times smaller than with C Std. Lib.
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdio.h"
#include "stdlib.h"
#include "tchar.h"

int main()
{
 TCHAR szBuffer[]=_T("  -12345678987654");
 _int64 iNum;
 
 iNum=_ttoi64(szBuffer);
 _tprintf(_T("iNum=%Id\n"), iNum);
 iNum=_abs64(iNum);
 _tprintf(_T("iNum=%Id\n"), iNum);
 getchar();

 return 0;
}
// Output:
// ==================
// iNum=-12345678987654
// iNum=12345678987654

// 12,345,678,987,654
// Twelve Trillion, Three Hundred and Forty Five Billion, Six Hundred and Seventy Eight Million,
// Nine Hundred and Eighty Seven Thousand, Six Hundred and Fifty Four.
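
The kind of conversion Demo3.cpp relies on can be sketched roughly as below.  This is not the TCLib _atoi64 source.  ParseInt64Sketch is a hypothetical name, it works on narrow characters only, and it does no overflow checking.  The multiply by ten on a 64 bit value is the sort of operation that, going by the Matthew Wilson quote earlier, turns into a call to _allmul() in an x86 build.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a string to 64 bit integer conversion:
// skip leading blanks, read an optional sign, then accumulate digits.
std::int64_t ParseInt64Sketch(const char* p)
{
    assert(p != nullptr);
    while (*p == ' ' || *p == '\t')        // skip leading white space
        ++p;

    bool negative = false;
    if (*p == '-' || *p == '+')            // optional sign
    {
        negative = (*p == '-');
        ++p;
    }

    std::int64_t value = 0;
    while (*p >= '0' && *p <= '9')         // stop at the first non digit
    {
        value = value * 10 + (*p - '0');   // 64 bit multiply and add
        ++p;
    }
    return negative ? -value : value;
}
```

Called on the narrow character equivalent of the Demo3 buffer, "  -12345678987654", this returns -12345678987654.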

     Demo3.cpp comes in at only 3.5 k.  Note that I defined iNum as an _int64 instead of the size_t or ssize_t I’d typically use.  By doing that the program will provide identical results when compiled in x86 or x64.  Had I used ssize_t, which would be a 32 bit number in x86, the converted number would be too large to fit with an x86 build.  It would wrap around and give bogus results.  Now the fun starts.   Let’s move on to my String Class, which we’re first using in Demo4.cpp…

Code: [Select]
// cl Demo4.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 4,096 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out   
// 4,608 Bytes With Full String Class
#define  UNICODE
#define  _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
//extern "C" int _fltused=1;

int main()
{
 String s1(_T("Hello, World!"));
 
 s1.Print(_T("s1         = "), true);
 _tprintf(_T("s1.lpStr() = %s\n"), s1.lpStr());
 getchar();
 
 return 0;
}

// Output:
// ==========================
// s1         = Hello, World!
// s1.lpStr() = Hello, World!

Note at top that with the inclusion of my String Class, which requires adding Strings.cpp to the command line compilation string and Strings.h to the includes, we’re seeing 4,096 bytes in the executable without the capabilities for integral numeric conversions, floating point conversions, and formatting (I’ll describe those in a bit), and 4,608 bytes including those capabilities (which, however, aren’t needed in this program).  This is controlled by #defines in Strings.h.  All this program does, by the way, is show use of constructor notation to assign the string L"Hello, World!" to String s1, and output the string to the console two ways, i.e., through the String::Print() and String::lpStr() member functions.  For comparison purposes here is a comparable program using the C++ Standard Library’s std::wstring class…

Code: [Select]
// cl Test3.cpp /O1 /Os /GS- /EHsc kernel32.lib
// 136,192 bytes
#include <string>
#include <cstdio>
#include <tchar.h>

int main()
{
 std::wstring s1(L"Hello, World!");
 
 wprintf(L"s1.c_str() = %s\n", s1.c_str());
 getchar();
 
 return 0;
}

As you can see, we’re saving about 130,000 bytes here through use of my techniques.  So that folks who aren’t members here can get at this code without the download, I’ll now post it in full.  First Strings.h…

continued.....

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #2 on: March 23, 2016, 01:22:23 AM »
Code: [Select]
// Strings.h                                                    // NULL Terminated String Class With Method Naming Conventions After BASIC Model Languages.  Implemented For
#ifndef Strings_h                                               // x86 / x64, ASCII / Unicode (For x86 Builds Comment Out x64 #define Below Left). NOTE!  Assumes fixed width
#define Strings_h                                               // char/wchar_t 8/16 bit encoding.  Not suitable for Chinese, Japanese or Korean (CJK) encodings with surrogate
                                                                // bytes.
#ifndef ssize_t
typedef SSIZE_T ssize_t;                                        // ssize_t is defined in GCC, but isn't defined in VC 9-15, but rather SSIZE_T.  For symmetry we'll define it.
#endif

#define MINIMUM_ALLOCATION                   16                 // allocate at least this for every String
#define EXPANSION_FACTOR                      2                 // repeated concatenations will keep doubling buffer
#define INTEGRAL_CONVERSIONS                                    // allows direct assignment of integral numeric values to Strings, e.g., String s(12345) or s=54321;
#define FLOATING_POINT_CONVERSIONS                              // allows direct assignment of floating point values to Strings, e.g., String s(3.14159) or s=2.98;
#define FORMATTING                                              // put commas every three places, rounding, left/right justification, specify field sizes (padding), etc.
#define CONSOLE_OUTPUT                                          // support console output, i.e., enable String::Print()
#define x64

class String
{
 public:                                                        // Constructors (10)
 String();                                                      // Uninitialized Constructor
 String(const int iSize, bool blnNullOut);                      // Constructor creates String of size iSize and optionally nulls out
 String(const TCHAR ch);                                        // Constructor creates a String initialized with a char, e.g., String c('A');
 String(const TCHAR* pStr);                                     // Constructor: Initializes with char*, e.g. s1 = "PowerBASIC! Compile Without Compromise!"
 String(const String& strAnother);                              // Constructor creates String initialized with another already existing String, e.g., String s2(s1);
 #ifdef INTEGRAL_CONVERSIONS
    String(int iNumber);                                        // Constructor creates String initialized with an int, e.g., String s1(2468); kind of Str$(2468) in PowerBASIC
    String(unsigned int uNumber);                               // Constructor creates String initialized with an unsigned int, e.g., String s1(2468); kind of Str$(2468) in PowerBASIC
    #ifdef x64
       String(size_t  uiNumber);                                // Constructor creates String from 64 bit unsigned number, e.g., String strBigNum(12345678987654);
       String(ssize_t iNumber);                                 // Constructor creates String from 64 bit signed number, e.g., String strBigNum(-12345678987654);
    #endif
 #endif
 #ifdef FLOATING_POINT_CONVERSIONS
    String(double dblNumber);                                   // Constructor creates String from floating point number, e.g., String s(3.14159);
 #endif
 String& operator=(const TCHAR c);                              // Assign a char to a String, e.g., String s='A';
 String& operator=(const TCHAR* pStr);                          // Assign a character string to a String Object, e.g.,  String s1 = "Compile Without Compromise";
 String& operator=(const String& strAnother);                   // Assign an already existing String to this one, e.g., String s2 = s1;
 #ifdef INTEGRAL_CONVERSIONS
    String& operator=(int iNumber);                             // Assign an int converted to a String to this, e.g., String s1 = -123456789;
    String& operator=(unsigned int uNumber);                    // Assign an unsigned int converted to a String to this, e.g., String s1 =  123456789;
    #ifdef x64
       String& operator=(size_t  uNumber);                      // Assign a 64 bit unsigned quantity converted to a String to this, e.g., String s2 =  12345678987654;
       String& operator=(ssize_t iNumber);                      // Assign a 64 bit   signed quantity converted to a String to this, e.g., String s2 = -12345678987654;
    #endif
 #endif
 #ifdef FLOATING_POINT_CONVERSIONS
    String& operator=(double dblNumber);                        // Assign a double converted to a String to this, e.g., String strDouble = 3.14159;
 #endif
 String operator+(const TCHAR ch);                              // Concatenates or adds a character to an already existing String, e.g., s1 = s1 + 'A';
 String operator+(const TCHAR* pChar);                          // Concatenates or adds a character array (char*) to an already existing String, e.g., s1 = s1 + lpText;
 String operator+(String& s);                                   // Concatenates or adds another String to this one, e.g., s1 = s1 + s2;
 bool operator==(String& s);                                    // Compares two Strings for case sensitive equality
 bool operator==(const TCHAR* pChar);                           // Compares a String against a char* for case sensitive equality
 bool operator!=(TCHAR* pChar);                                 // Compares a String against a char* for case sensitive inequality
 void LTrim();                                                  // Removes leading white space by modifying existing String
 void RTrim();                                                  // Removes trailing white space by modifying existing String
 void Trim();                                                   // Removes leading or trailing white space from existing String
 String Left(size_t iCntChars);                                 // Returns a String consisting of left-most iCntChars of this
 String Right(size_t iCntChars);                                // Returns a String consisting of right-most iCntChars of this
 String Mid(size_t iStart, size_t iCount);                      // Returns a String consisting of iCount characters of this starting at one based iStart
 int ParseCount(const TCHAR delimiter);                         // Returns count of delimited fields as specified by char delimiter, i.e., comma, space, tab, etc.
 void Parse(String* pStr, TCHAR delimiter, size_t iParseCount); // Parses this based on delimiter.  Must pass in 1st parameter String* to sufficient number of Strings
 String Remove(const TCHAR* pCharsToRemove);                    // Returns A String With All The chars In A char* Removed (Individual char removal)
 String Remove(const TCHAR* pStrToRemove, bool bCaseSensitive); // Returns a String with 1st parameter removed.  2nd is bool for case sensitivity.
 String Replace(TCHAR* pToRemove, TCHAR* pReplacement);         // Replaces pToRemove with pReplacement in new String.  Replacement can cause String to grow
 
 int InStr                                                      // Returns one based position of pStr in this by case sensitivity and left/right starting position
 (
  const TCHAR* pStr,                                            // pStr -- TCHAR* To Character String To Search For
  bool  blnCaseSensitive,                                       // true / case sensitive; false / case insensitive
  bool  blnStartLeft                                            // Can Specify Search Start From Beginning Or End
 );
 
 int InStr                                                      // Returns one based position of String in this by case sensitivity and left/right starting position
 (
  const String& str,                                            // String To Search For
  bool  blnCaseSensitive,                                       // true / case sensitive; false / case insensitive
  bool  blnStartLeft                                            // Can Specify Search Start From Beginning Or End
 );
 
 #ifdef FORMATTING
    void Format                                                 // dblNumber converted to String, in iArrayCount Buffer Size, with iDecPlaces right of decimal, left or right justified.
    (
     double dblNumber,
     size_t iArraySizeCountOfObjects,                           // Number Of Elements In TCHAR Array, including the NULL, i.e., the size of the underlying memory allocation.  A 64 Byte wchar_t buffer has 32 elements, so pass 32.
     size_t iDecimalPlaces,                                     // Number of decimal places to right of decimal
     TCHAR  cDecimalSeperator,                                  // In US We Use '.'.  In Europe mostly this -- ','
     bool   blnRightJustified                                   // Left/Right Justify Result in iArraySizeCountOfObjects Buffer.
    );
   
    void Format                                                 // double converted to Str, in iArCnt Buf Size, with iDecPlaces right of dec, with cSeperator for thousands, l/r justified.
    (
     double dblNumber,                                          // double to be converted to String
     size_t iArraySizeCountOfObjects,                           // See above.  Size of buffer in elements.  If the buffer is this ... TCHAR szBuf[32], then this parameter is 32.
     size_t iDecimalPlaces,                                     // Number of decimal places to right of decimal
     TCHAR  cThousandsSeperator,                                // Period, comma, whatever
     TCHAR  cDecimalSeperator,                                  // Period, comma, whatever
     bool   blnRightJustified                                   // true is right justified; false left justified in iArraySizeCountOfObjects buffer
    );
   
    void Format                                                 // For integral numbers; can specify justification, field width, and separator for thousands place
    (
     ssize_t iNumber,                                           // Integral number to format
     size_t  iArraySizeCountOfObjects,                          // See above.  For TCHAR* pBuffer=(TCHAR*)malloc(32*sizeof(TCHAR)), it will be 32
     TCHAR   cThousandsSeperator,                               // Comma, period, whatever
     bool    blnRightJustified                                  // true is right justified; false left justified in iArraySizeCountOfObjects buffer
    );                   
 #endif
 
 #ifdef CONSOLE_OUTPUT
    void Print(bool blnCrLf);                                   // Outputs String with or without CrLf
    void Print(const TCHAR* pText, bool blnCrLf);               // Parameter #1 - leading text literal const; Parameter #2 - with/without CrLf
 #endif
 size_t Len();                                                  // accessor for String::iLen member
 size_t Capacity();                                             // Will be one less than underlying memory allocation
 TCHAR* lpStr();                                                // Same as std::string.c_str().  Returns pointer to underlying Z String
 ~String();                                                     // String Destructor

 private:
 TCHAR*    lpBuffer;                                            // Buffer controlled by String
 size_t    iLen;                                                // Keeps track of present length of String
 size_t    iCapacity;                                           // Keeps track of present capacity of String.  Will be one less element than length of this->lpBuffer memory allocation
};

String operator+(TCHAR* lhs, String& rhs);                      // Global function.  TCHAR* (not char*) so the declaration matches the definition in Strings.cpp
#endif
// End Strings.h
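
To make the sizing rules concrete, here is a minimal standalone sketch of the allocation arithmetic implied by MINIMUM_ALLOCATION, EXPANSION_FACTOR, and the comment on iCapacity above.  The names roundedAllocation, grownAllocation, and capacityFor are made up for illustration and don't appear in Strings.h.  The sketch assumes the round up to the next multiple of 16 formula that the constructors and operator= overloads in Strings.cpp use, and it needs only <stddef.h>, so it compiles with or without TCLib.

```cpp
#include <stddef.h>

const size_t kMinimumAllocation = 16;   // mirrors MINIMUM_ALLOCATION in Strings.h
const size_t kExpansionFactor   = 2;    // mirrors EXPANSION_FACTOR in Strings.h

// Hypothetical helper: element count to allocate for a string of iLen
// characters.  The result is the next multiple of 16 strictly greater than
// iLen, so there is always room for the terminating NULL.
size_t roundedAllocation(size_t iLen)
{
 return (iLen/kMinimumAllocation+1)*kMinimumAllocation;
}

// Hypothetical helper: element count used when an assignment outgrows the
// present buffer.  The new length is doubled first so that repeated
// assignments or concatenations don't reallocate every time.
size_t grownAllocation(size_t iNewLen)
{
 return (iNewLen*kExpansionFactor/kMinimumAllocation+1)*kMinimumAllocation;
}

// Capacity is always one less than the allocation; the last element is
// reserved for the NULL.
size_t capacityFor(size_t iAllocation)
{
 return iAllocation-1;
}
```

For the 13 character "Hello, World!" from Demo4.cpp, roundedAllocation(13) gives a 16 element buffer and capacityFor(16) gives a capacity of 15.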

And here is Strings.cpp, the implementation of the String Class declared in Strings.h…

Code: [Select]
// Strings.cpp
#define UNICODE
#define _UNICODE
#include  <windows.h>
#include  "string.h"
#include  "stdio.h"
#include  "Strings.h"
#include  "tchar.h"


String operator+(TCHAR* lhs, String& rhs)    //global function
{
 String sr=lhs;
 sr=sr+rhs;

 return sr;
}


String::String()
{
 this->lpBuffer=new TCHAR[16];
 this->lpBuffer[0]=0;
 this->iLen=0;
 this->iCapacity=15;
}


String::String(const int iSize, bool blnFillNulls)  //Constructor Creates String With Custom Sized
{                                                   //Buffer (rounded up to paragraph boundary)
 int iNewSize      = (iSize/16+1)*16;
 this->lpBuffer    = new TCHAR[iNewSize];
 this->iCapacity   = iNewSize-1;
 this->iLen        = 0;
 this->lpBuffer[0] = _T('\0');
 if(blnFillNulls)
    memset(this->lpBuffer,0,iNewSize*sizeof(TCHAR));
}


String::String(const TCHAR ch)  //Constructor: Initializes with wchar_t
{
 this->iLen=1;
 int iNewSize=MINIMUM_ALLOCATION;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 this->lpBuffer[0]=ch, this->lpBuffer[1]=_T('\0');
}


String::String(const TCHAR* pStr)  //Constructor: Initializes with wchar_t*
{
 this->iLen=_tcslen(pStr);
 int iNewSize=(this->iLen/16+1)*16;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 _tcscpy(lpBuffer,pStr);
}


String::String(const String& s)  //Constructor Initializes With Another String, i.e., Copy Constructor
{
 int iNewSize=(s.iLen/16+1)*16;
 this->iLen=s.iLen;
 this->lpBuffer=new TCHAR[iNewSize];
 this->iCapacity=iNewSize-1;
 _tcscpy(this->lpBuffer,s.lpBuffer);
}


#ifdef INTEGRAL_CONVERSIONS
   String::String(int iNum)
   {
    this->lpBuffer=new TCHAR[16];
    this->iCapacity=15;
    this->iLen=_stprintf(this->lpBuffer,_T("%d"),iNum);
   }

   
   String::String(unsigned int iNum)
   {
    this->lpBuffer=new TCHAR[16];
    this->iCapacity=15;
    this->iLen=_stprintf(this->lpBuffer,_T("%u"),iNum);
   }
   
   
   #ifdef x64
      String::String(size_t iNum)
      {
       this->lpBuffer=new TCHAR[32];
       this->iCapacity=31;
       this->iLen=_stprintf(this->lpBuffer,_T("%Iu"),iNum);
      }


      String::String(ssize_t iNum)
      {
       this->lpBuffer=new TCHAR[32];
       this->iCapacity=31;
       this->iLen=_stprintf(this->lpBuffer,_T("%Id"),iNum);
      }
   #endif
#endif


#ifdef FLOATING_POINT_CONVERSIONS
   String::String(double dblNumber)
   {
    this->lpBuffer=new TCHAR[24];
    this->iCapacity=23;
    this->iLen=FltToTch(this->lpBuffer,dblNumber,24,6,_T('.'),false);
   }
#endif   


String& String::operator=(const TCHAR ch)
{
 this->lpBuffer[0]=ch, this->lpBuffer[1]=_T('\0');
 this->iLen=1;
 return *this;
}


String& String::operator=(const TCHAR* pStr)  // Assign TCHAR* to String
{
 size_t iNewLen=_tcslen(pStr);
 if(iNewLen>this->iCapacity)
 {
    delete [] this->lpBuffer;
    int iNewSize=(iNewLen*EXPANSION_FACTOR/16+1)*16;
    this->lpBuffer=new TCHAR[iNewSize];
    this->iCapacity=iNewSize-1;
 }
 _tcscpy(this->lpBuffer,pStr);
 this->iLen=iNewLen;
   
 return *this;
}


String& String::operator=(const String& strAnother)
{
 if(this==&strAnother)
    return *this;
 if(strAnother.iLen>this->iCapacity)
 {
    delete [] this->lpBuffer;
    int iNewSize=(strAnother.iLen*EXPANSION_FACTOR/16+1)*16;
    this->lpBuffer=new TCHAR[iNewSize];
    this->iCapacity=iNewSize-1;
 }
 _tcscpy(this->lpBuffer,strAnother.lpBuffer);
 this->iLen=strAnother.iLen;

 return *this;
}


#ifdef INTEGRAL_CONVERSIONS
   #ifdef x64
      String& String::operator=(size_t iNum)
      {
       if(this->iCapacity>=24)
          this->iLen=_stprintf(this->lpBuffer,_T("%Iu"),iNum);
       else
       {
          delete [] this->lpBuffer;
          this->lpBuffer=new TCHAR[24];
          this->iCapacity=23;
          this->iLen=_stprintf(this->lpBuffer,_T("%Iu"),iNum);
       }

       return *this;
      }


      String& String::operator=(ssize_t iNum)
      {
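       if(this->iCapacity>=23)                       // 23 matches the 24 element buffer allocated below; a 64 bit signed number needs at most 20 characters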
          this->iLen=_stprintf(this->lpBuffer,_T("%Id"),iNum);
       else
       {
          delete [] this->lpBuffer;
          this->lpBuffer=new TCHAR[24];
          this->iCapacity=23;
          this->iLen=_stprintf(this->lpBuffer,_T("%Id"),iNum);
       }

       return *this;
      }
   #endif


   String& String::operator=(int iNum)
   {
    if(this->iCapacity>=15)
       this->iLen=_stprintf(this->lpBuffer,_T("%d"),iNum);
    else
    {
       delete [] this->lpBuffer;
       this->lpBuffer=new TCHAR[16];
       this->iCapacity=15;
       this->iLen=_stprintf(this->lpBuffer,_T("%d"),iNum);
    }

    return *this;
   }


   String& String::operator=(unsigned int iNum)
   {
    if(this->iCapacity>=15)
       this->iLen=_stprintf(this->lpBuffer,_T("%u"),iNum);
    else
    {
       delete [] this->lpBuffer;
       this->lpBuffer=new TCHAR[16];
       this->iCapacity=15;
       this->iLen=_stprintf(this->lpBuffer,_T("%u"),iNum);
    }

    return *this;
   }
#endif


#ifdef FLOATING_POINT_CONVERSIONS
   String& String::operator=(double dblNumber)
   {
    if(this->iCapacity>=24)
       this->iLen=FltToTch(this->lpBuffer,dblNumber,24,6,_T('.'),false);
    else
    {
       delete [] this->lpBuffer;
       this->lpBuffer=new TCHAR[24];
       this->iLen=FltToTch(this->lpBuffer,dblNumber,24,6,_T('.'),false);
       this->iCapacity=23;
    }

    return *this;
   }
#endif
   

String String::operator+(const TCHAR ch)
{
 int iNewLen=this->iLen+1;

 String s(iNewLen,false);
 _tcscpy(s.lpBuffer,this->lpBuffer);
 s.lpBuffer[iNewLen-1]=ch;
 s.lpBuffer[iNewLen]=_T('\0');
 s.iLen=iNewLen;

 return s;
}


String String::operator+(const TCHAR* pStr)
{
 int iNewLen=_tcslen(pStr)+this->iLen;
 String s(iNewLen,false);
 _tcscpy(s.lpBuffer,this->lpBuffer);
 _tcscat(s.lpBuffer,pStr);
 s.iLen=iNewLen;

 return s;
}


String String::operator+(String& strRef)
{
 int iNewLen=strRef.iLen+this->iLen;
 String s(iNewLen,false);
 _tcscpy(s.lpBuffer,this->lpBuffer);
 _tcscat(s.lpBuffer,strRef.lpBuffer);
 s.iLen=iNewLen;

 return s;
}


bool String::operator==(String& strRef)
{
 if(_tcscmp(this->lpStr(),strRef.lpStr())==0)
    return true;
 else
    return false;
}


bool String::operator==(const TCHAR* pStr)
{
 if(_tcscmp(this->lpStr(),pStr)==0)
    return true;
 else
    return false;
}


bool String::operator!=(TCHAR* pStr)
{
 if(_tcscmp(this->lpStr(),pStr)==0)
    return false;
 else
    return true;
}


String String::Left(size_t iNum)   //  strncpy = _tcsncpy
{
 if(iNum<this->iLen)
 {
    size_t iNewSize=(iNum*EXPANSION_FACTOR/16+1)*16;
    String sr(iNewSize,true);
    _tcsncpy(sr.lpBuffer,this->lpBuffer,iNum);   // wcsncpy(wchar_t pDest, wchar_t pSource, size_t iCount);
    sr.lpBuffer[iNum]=_T('\0');
    sr.iLen=iNum;
    return sr;
 }
 else
 {
    String sr=*this;
    sr.iLen=this->iLen;
    return sr;
 }
}


String String::Right(size_t iNum)  //Returns Right$(strMain,iNum)
{
 if(iNum<this->iLen)
 {
    size_t iNewSize=(iNum*EXPANSION_FACTOR/16+1)*16;
    String sr(iNewSize,false);
    _tcsncpy(sr.lpBuffer,this->lpBuffer+this->iLen-iNum,iNum);
    sr.lpBuffer[iNum]=_T('\0');
    sr.iLen=iNum;
    return sr;
 }
 else
 {
    String sr=*this;
    sr.iLen=this->iLen;
    return sr;
 }
}


String String::Mid(size_t iStart, size_t iCount)
{
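 if(iStart<1 || iStart>this->iLen)   // one based start must fall inside the String, or iLen-iStart+1 below wraps around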
 {
    String sr;
    return sr;
 }
 if(iCount+iStart>this->iLen)
    iCount=this->iLen-iStart+1;
 String sr(iCount,false);
 _tcsncpy(sr.lpBuffer,this->lpBuffer+iStart-1,iCount);
 sr.lpBuffer[iCount]=_T('\0');
 sr.iLen=iCount;

 return sr;
}


void String::LTrim()
{
 size_t iCt=0;

 for(size_t i=0; i<this->iLen; i++)
 {
     if(this->lpBuffer[i]==9||this->lpBuffer[i]==10||this->lpBuffer[i]==13||this->lpBuffer[i]==32)
        iCt++;
     else
        break;
 }
 if(iCt)
 {
    for(size_t i=iCt; i<=this->iLen; i++)
        this->lpBuffer[i-iCt]=this->lpBuffer[i];
 }
 this->iLen=this->iLen-iCt;
}


void String::RTrim()
{
 int iCt=0;

 for(int i=(int)this->iLen-1; i>=0; i--)   // i>=0 so the first character gets tested too (all white space Strings)
 {
     if(this->lpBuffer[i]==9||this->lpBuffer[i]==10||this->lpBuffer[i]==13||this->lpBuffer[i]==32)
        iCt++;
     else
        break;
 }
 this->lpBuffer[this->iLen-iCt]=0;
 this->iLen=this->iLen-iCt;
}


void String::Trim()
{
 this->LTrim();
 this->RTrim();
}


int String::ParseCount(const TCHAR delimiter)   //returns one more than # of
{                                               //delimiters so it accurately
 int iCtr=0;                                    //reflects # of strings delimited
 TCHAR* p;                                      //by delimiter.

 p=this->lpBuffer;
 while(*p)
 {
   if(*p==delimiter)
      iCtr++;
   p++;
 }

 return ++iCtr;
}


void String::Parse(String* pStr, TCHAR delimiter, size_t iParseCount)
{
 TCHAR* pBuffer=new TCHAR[this->iLen+1];
 if(pBuffer)
 {
    TCHAR* p=pBuffer;
    TCHAR* c=this->lpBuffer;
    while(*c)
    {
       if(*c==delimiter)
          *p=0;
       else
          *p=*c;
       p++, c++;
    }
    *p=0, p=pBuffer;
    for(size_t i=0; i<iParseCount; i++)
    {
        pStr[i]=p;
        p=p+pStr[i].iLen+1;
    }
    delete [] pBuffer;
 }
}


int iMatch(TCHAR* pThis, const TCHAR* pStr, bool blnCaseSensitive, bool blnStartBeginning, int i, int iParamLen)
{
 if(blnCaseSensitive)
 {
    if(_tcsncmp(pThis+i,pStr,iParamLen)==0)   //_tcsncmp
       return i+1;
    else
       return 0;
 }
 else
 {
    if(_tcsnicmp(pThis+i,pStr,iParamLen)==0)  //__tcsnicmp
       return i+1;
    else
       return 0;
 }
}


int String::InStr(const TCHAR* pStr, bool blnCaseSensitive, bool blnStartBeginning)
{
 int i,iParamLen,iRange,iReturn;

 if(*pStr==0)
    return 0;
 iParamLen=_tcslen(pStr);
 iRange=this->iLen-iParamLen;
 if(blnStartBeginning)
 {
    if(iRange>=0)
    {
       for(i=0; i<=iRange; i++)
       {
           iReturn=iMatch(this->lpBuffer,pStr,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
       }
    }
 }
 else
 {
    if(iRange>=0)
    {
       for(i=iRange; i>=0; i--)
       {
           iReturn=iMatch(this->lpBuffer,pStr,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
       }
    }
 }

 return 0;
}


int String::InStr(const String& s, bool blnCaseSensitive, bool blnStartBeginning)
{
 int i,iParamLen,iRange,iReturn;

 if(s.iLen==0)
    return 0;
 iParamLen=s.iLen;
 iRange=this->iLen-iParamLen;
 if(blnStartBeginning)
 {
    if(iRange>=0)
    {
       for(i=0; i<=iRange; i++)
       {
           iReturn=iMatch(this->lpBuffer,s.lpBuffer,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
       }
    }
 }
 else
 {
    if(iRange>=0)
    {
       for(i=iRange; i>=0; i--)
       {
           iReturn=iMatch(this->lpBuffer,s.lpBuffer,blnCaseSensitive,blnStartBeginning,i,iParamLen);
           if(iReturn)
              return iReturn;
       }
    }
 }

 return 0;
}


String String::Remove(const TCHAR* pStr) // Individual character removal, i.e., if this contains the
{                                        // English alphabet...
 unsigned int i,j,iStrLen,iParamLen;     //
 TCHAR *pThis, *pThat, *p;               // abcdefghijklmnopqrstuvwxyz
 bool blnFoundBadTCHAR;                  //
                                         // ...and pStr is this...
 iStrLen=this->iLen;                     //
 String sr((int)iStrLen,false);          // "aeiou"
 iParamLen=_tcslen(pStr);                //
 pThis=this->lpBuffer;                   // ,,,then this will be returned in the returned String...
 p=sr.lpStr();                           //
 for(i=0; i<iStrLen; i++)                // bcdfghjklmnpqrstvwxyz
 {                                       //
     pThat=(TCHAR*)pStr;                 // That is, the vowels will be individually removed
     blnFoundBadTCHAR=false;
     for(j=0; j<iParamLen; j++)
     {
         if(*pThis==*pThat)
         {
            blnFoundBadTCHAR=true;
            break;
         }
         pThat++;
     }
     if(!blnFoundBadTCHAR)
     {
        *p=*pThis;
        p++;
        *p=_T('\0');
     }
     pThis++;
 }
 sr.iLen=_tcslen(sr.lpStr());

 return sr;
}


String String::Remove(const TCHAR* pMatch, bool blnCaseSensitive)
{
 size_t i,iCountMatches=0,iCtr=0;

 size_t iLenMatch=_tcslen(pMatch);
 if(iLenMatch==0)                                     // nothing to remove; an empty match would also loop forever below
    return *this;
 for(i=0; i<this->iLen; i++)
 {
     if(blnCaseSensitive)
     {
        if(_tcsncmp(lpBuffer+i,pMatch,iLenMatch)==0)  //_tcsncmp
           iCountMatches++;
     }
     else
     {
        if(_tcsnicmp(lpBuffer+i,pMatch,iLenMatch)==0) //__tcsnicmp
           iCountMatches++;
     }
 }
 int iAllocation=this->iLen-(iCountMatches*iLenMatch);
 String sr(iAllocation,false);
 for(i=0; i<this->iLen; i++)
 {
     if(blnCaseSensitive)
     {
        if(_tcsncmp(this->lpBuffer+i,pMatch,iLenMatch)==0)
           i+=iLenMatch-1;
        else
        {
           sr.lpBuffer[iCtr]=this->lpBuffer[i];
           iCtr++;
        }
        sr.lpBuffer[iCtr]=_T('\0');
     }
     else
     {
        if(_tcsnicmp(this->lpBuffer+i,pMatch,iLenMatch)==0)
           i+=iLenMatch-1;
        else
        {
           sr.lpBuffer[iCtr]=this->lpBuffer[i];
           iCtr++;
        }
        sr.lpBuffer[iCtr]=_T('\0');
     }
 }
 sr.iLen=iCtr;
 return sr;
}


String String::Replace(TCHAR* pMatch, TCHAR* pNew)  //strncmp = _tcsncmp
{
 size_t i,iLenMatch,iLenNew,iCountMatches,iExtra,iExtraLengthNeeded,iAllocation,iCtr;
 iLenMatch=_tcslen(pMatch);
 if(iLenMatch==0)                        // nothing to replace; an empty match would also loop forever below
    return *this;
 iCountMatches=0, iAllocation=0, iCtr=0;
 iLenNew=_tcslen(pNew);
 if(iLenNew==0)
 {
    String sr=this->Remove(pMatch,true); //return
    return sr;
 }
 else
 {
    iExtra=iLenNew-iLenMatch;
    for(i=0; i<this->iLen; i++)
    {
        if(_tcsncmp(lpBuffer+i,pMatch,iLenMatch)==0)
           iCountMatches++;  //Count how many match strings
    }
    iExtraLengthNeeded=iCountMatches*iExtra;
    iAllocation=this->iLen+iExtraLengthNeeded;
    String sr(iAllocation,false);
    for(i=0; i<this->iLen; i++)
    {
        if(_tcsncmp(this->lpBuffer+i,pMatch,iLenMatch)==0)
        {
           _tcscpy(sr.lpBuffer+iCtr,pNew);
           iCtr+=iLenNew;
           i+=iLenMatch-1;
        }
        else
        {
           sr.lpBuffer[iCtr]=this->lpBuffer[i];
           iCtr++;
        }
        sr.lpBuffer[iCtr]=_T('\0');
    }
    sr.iLen=iCtr;
    return sr;
 }
}


#ifdef FORMATTING
   void String::Format(double dblNumber, size_t iArraySize, size_t iDecPlaces, TCHAR cDecimalSeperator, bool blnRightJustified)
   {
    if(this->iCapacity<iArraySize-1)
    {
       delete [] this->lpBuffer;
       this->lpBuffer=new TCHAR[iArraySize];
       this->iCapacity=iArraySize-1;
    }   
    this->iLen=FltToTch(this->lpBuffer,dblNumber,iArraySize,iDecPlaces,cDecimalSeperator,blnRightJustified);
   }

   
   void String::Format(double dblNumber, size_t iArraySizeCountOfObjects, size_t iDecimalPlaces, TCHAR cThousandsSeperator, TCHAR cDecimalSeperator, bool blnRightJustified)
   {
    size_t iLen=0,iPadding=0;
    int iDecPt=0;
    _int64 iInt;
   
    if(this->iCapacity<iArraySizeCountOfObjects-1)
    {
       delete [] this->lpBuffer;
       size_t iNewSize=(iArraySizeCountOfObjects/16+1)*16;
       this->lpBuffer=new TCHAR[iNewSize];
       this->iCapacity=iNewSize-1;
       this->lpBuffer[0]=0;
    }
    String s1(24,true);
    s1.Format(dblNumber,24,iDecimalPlaces,cDecimalSeperator,false);
    s1.Trim();
    if(iDecimalPlaces)
    {
       iDecPt=s1.InStr(cDecimalSeperator,false,false);
       String s2=s1.Left(iDecPt-1);
       iInt=_ttoi64(s2.lpStr());
    }
    else
       iInt=_ttoi64(s1.lpStr());
    String s3(28,true);
    if(iInt==0 && dblNumber<0.0)
       s3=_T("-0");
    else
       s3.Format((ssize_t)iInt,28,cThousandsSeperator,false);
    s3.Trim();
    if(iDecimalPlaces)
       s3=s3+s1.Mid(iDecPt,iDecimalPlaces+1);
    iLen=s3.Len();
    if(iLen>this->iCapacity)
       return;
    iPadding=iArraySizeCountOfObjects-iLen-1;
    if(blnRightJustified)
    {
       for(size_t i=0; i<iPadding; i++)
           this->lpBuffer[i]=32;
       this->lpBuffer[iPadding]=0;
       _tcscat(this->lpBuffer, s3.lpStr()); 
    }
    else
    {
       _tcscpy(this->lpBuffer,s3.lpStr());
       for(size_t i=iLen; i<iArraySizeCountOfObjects-1; i++)
           this->lpBuffer[i]=32;
       this->lpBuffer[iArraySizeCountOfObjects-1]=0;   
    }
    this->iLen=_tcslen(this->lpBuffer);           
   }
 
   
   void String::Format(ssize_t iNumber, size_t iArraySizeCountOfObjects, TCHAR cThousandsSeperator, bool blnRightJustified)
   {
    bool blnPositive;
    TCHAR szBuf1[28];
    TCHAR szBuf2[28];
    size_t iDigit=0;
    size_t iLen=0;
    size_t iCtr=1;
    size_t j=0;

    memset(szBuf1,0,28*sizeof(TCHAR));
    memset(szBuf2,0,28*sizeof(TCHAR));
    if(iNumber<0)
       blnPositive=false;
    else
       blnPositive=true;
    #ifdef x64
       iNumber=_abs64(iNumber);
    #else
       iNumber=abs(iNumber);
    #endif
    #ifdef x64
       _stprintf(szBuf1,_T("%Iu"),(size_t)iNumber);
    #else
       _stprintf(szBuf1,_T("%u"),(size_t)iNumber);
    #endif
    _tcsrev(szBuf1);
    iLen=_tcslen(szBuf1);
    for(size_t i=0; i<iLen; i++)
    {
        if(iCtr==3)
        {
           iDigit++;
           szBuf2[j]=szBuf1[i];
           if(iDigit<iLen)
           {
              j++;
              szBuf2[j]=cThousandsSeperator;
           }
           j++, iCtr=1;
        }
        else
        {
           iDigit++;
           szBuf2[j]=szBuf1[i];
           j++, iCtr++;
        }
    }
    _tcsrev(szBuf2); // Done With Creating String With Commas
    memset(szBuf1,0,28*sizeof(TCHAR));  // Reuse szBuf1
    if(blnPositive)
       _tcscpy(szBuf1,szBuf2);
    else
    {
       szBuf1[0]=_T('-');
       _tcscat(szBuf1,szBuf2);
    }   
    size_t iRequiredBytes;         // Find out which of two is larger - length of string necessary
    iLen=_tcslen(szBuf1);          // or iFldLen
    if(iArraySizeCountOfObjects<=iLen)
       return;
    if(iArraySizeCountOfObjects>(int)iLen)
       iRequiredBytes=iArraySizeCountOfObjects;
    else
       iRequiredBytes=iLen;
    if(iRequiredBytes>(size_t)this->iCapacity)
    {
       delete [] this->lpBuffer;
       int iNewSize=(iRequiredBytes*EXPANSION_FACTOR/16+1)*16;
       this->lpBuffer=new TCHAR[iNewSize];
       this->iCapacity=iNewSize-1;
    }
    memset(this->lpBuffer,0,this->iCapacity*sizeof(TCHAR)); 
    ssize_t iDifference=iArraySizeCountOfObjects-iLen-1;
    if(blnRightJustified)                                             
    {
       if(iDifference > 0)
       {
          for(size_t i=0; i<(size_t)iDifference; i++)
              this->lpBuffer[i]=_T(' ');  // 32
       }
       _tcscat(this->lpBuffer,szBuf1);
    }
    else
    {
       _tcscpy(this->lpBuffer,szBuf1);
       if(iDifference>0)
       {
          for(size_t i=iLen; i<iDifference+iLen; i++)
              this->lpBuffer[i]=_T(' ');  // 32
       }
    }
    this->iLen=_tcslen(this->lpBuffer);
   }
#endif


TCHAR* String::lpStr()
{
 return this->lpBuffer;
}


size_t String::Len(void)
{
 return this->iLen;
}


size_t String::Capacity(void)
{
 return this->iCapacity;
}


#ifdef CONSOLE_OUTPUT
   void String::Print(bool blnCrLf)
   {
    _tprintf(_T("%s"),this->lpBuffer);
    if(blnCrLf)
       _tprintf(_T("\n"));
   }

   
   void String::Print(const TCHAR* pStr, bool blnCrLf)
   {
    _tprintf(_T("%s%s"),pStr,lpBuffer);
    if(blnCrLf)
       _tprintf(_T("\n"));
   }
#endif


String::~String()
{
 delete [] this->lpBuffer;
}
// End Strings.cpp
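
String::ParseCount() and String::Parse() above work as a pair: count the delimiters, then copy the buffer, overwrite each delimiter with a NULL, and walk the pieces.  Here is a minimal sketch of that idea on plain char buffers, assuming nothing but <string.h>.  The names fieldCount and splitInPlace are hypothetical and are not members of String, and the sketch hands back pointers into a caller supplied scratch buffer where the real Parse() assigns to an array of Strings.

```cpp
#include <string.h>

// Hypothetical helper: one more than the number of delimiters, so it
// reflects the number of delimited fields (same idea as ParseCount()).
size_t fieldCount(const char* pText, char delimiter)
{
 size_t iCount=1;
 for(const char* p=pText; *p; p++)
 {
     if(*p==delimiter)
        iCount++;
 }
 return iCount;
}

// Hypothetical helper: copies pText into pScratch, overwrites each delimiter
// with a NULL, and stores a pointer to the start of each field in pFields.
// Returns the number of fields stored, or 0 if pScratch is too small.
size_t splitInPlace(const char* pText, char delimiter, char* pScratch,
                    size_t iScratchSize, const char** pFields, size_t iMaxFields)
{
 size_t iLen=strlen(pText);
 if(iLen+1>iScratchSize || iMaxFields==0)
    return 0;
 memcpy(pScratch,pText,iLen+1);          // copy includes the terminating NULL
 size_t n=0;
 pFields[n++]=pScratch;                  // first field starts at the beginning
 for(size_t i=0; i<iLen; i++)
 {
     if(pScratch[i]==delimiter)
     {
        pScratch[i]='\0';                // terminate the field just finished
        if(n==iMaxFields)                // no room to record another field
           break;
        pFields[n++]=pScratch+i+1;       // next field starts after the delimiter
     }
 }
 return n;
}
```

Splitting "Zero,One,Two" on a comma this way yields three fields.  As with String::Parse(), it is up to the caller to provide enough slots, which is what the count function is for.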


continued...

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #3 on: March 23, 2016, 01:24:46 AM »
     In some of my previous writings here I described one of the downsides of having one’s own String Class as being the fact that it is always changing.  The above code is witness to that.  This incarnation of my String Class is again different from any others I have posted.  I’m always working on it.  Since this floating point issue bedeviled me so, I ended up reworking my formatting functions just a few weeks ago to improve on what I had before, and in the process I removed two String::Money() members which were simply special cases of my other formatting members where only two digits to the right of the decimal point were required.  Also, I added the capability for the user to specify which symbols are to be used for the decimal point and the thousands separator.

     Now here’s the scoop on some defines you’ll see in both Strings.h and Strings.cpp.  Since I take such great pleasure in minimizing the sizes of my executables, it has never been lost on me that some of my projects use only certain portions of my String Class, while other projects use different portions, i.e., different member functions.  Related to this, when I’m working with various utility code the thought often crosses my mind that the code being worked on might be a candidate for inclusion in my String Class.  Of course, all this can cause the String Class to grow and grow and arrive at a point where no particular project uses all of it.  My solution was to wrap various ‘chunks’ of code in #ifdef conditional compilation blocks, so that I can leave out portions of the String Class that are not needed in a given project.  These have changed over the years.  What I presently have are these…

#define INTEGRAL_CONVERSIONS                                   

This allows direct assignment of integral numeric values to Strings through the operator= overload, or constructor notation, e.g., String s=54321, or String s(12345).  It can be thought of as analogous to PowerBASIC’s Str$() function.  The C++ Standard Library isn’t set up to do this, I don’t believe.  Not that mine’s better for doing it, but rather just different.

//#define FLOATING_POINT_CONVERSIONS                             

This is the same as above but for dealing with floating point numbers.  It allows direct assignment of floating point values to Strings, e.g., String s(3.14159) or String s = -98765.43210. 

//#define FORMATTING                                             

This is kind of a replacement for PowerBASIC’s Format$() function.  There are three overloaded format members like so…

Code: [Select]
#ifdef FORMATTING
    void Format                                                 // dblNumber converted to String, in iArrayCount Buffer Size, with iDecPlaces right of decimal, left or right justified.
    (
     double dblNumber,
     size_t iArraySizeCountOfObjects,                           // Number Of Elements In TCHAR Array, including the NULL, i.e., the size of the underlying allocation.  A 64 Byte wchar_t buffer has 32 elements, so pass 32.
     size_t iDecimalPlaces,                                     // Number of decimal places to right of decimal
     TCHAR  cDecimalSeperator,                                  // In US We Use '.'.  In Europe mostly this -- ','
     bool   blnRightJustified                                   // Left/Right Justify Result in iArraySizeCountOfObjects Buffer.
    );
   
    void Format                                                 // double converted to Str, in iArCnt Buf Size, with iDecPlaces right of dec, with cSeperator for thousands, l/r justified.
    (
     double dblNumber,                                          // double to be converted to String
     size_t iArraySizeCountOfObjects,                           // See above.  Size of buffer in elements.  If the buffer is this ... TCHAR szBuf[32], then this parameter is 32.
     size_t iDecimalPlaces,                                     // Number of decimal places to right of decimal
     TCHAR  cThousandsSeperator,                                // Period, comma, whatever
     TCHAR  cDecimalSeperator,                                  // Period, comma, whatever
     bool   blnRightJustified                                   // true is right justified; false left justified in iArraySizeCountOfObjects buffer
    );
   
    void Format                                                 // For integral numbers; can specify justification, field width, and seperator for thousands place
    (
     ssize_t iNumber,                                           // Integral number to format
     size_t  iArraySizeCountOfObjects,                          // See above.  For TCHAR* pBuffer=(TCHAR*)malloc(32*sizeof(TCHAR)), it will be 32
     TCHAR   cThousandsSeperator,                               // Comma, period, whatever
     bool    blnRightJustified                                  // true is right justified; false left justified in iArraySizeCountOfObjects buffer
    );                   
 #endif

The bottom one takes an integral 1st parameter and the upper two take doubles.  With these you can line up decimal points (or decimal commas, which are used in much of Europe), insert a thousands separator every three digits, specify left/right justification within a field width you specify, specify the desired number of decimal places, etc.
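
The conditional compilation idea described above can be pictured with a much smaller class.  What follows is only a sketch under stated assumptions: MiniString is a hypothetical stand-in and not the String Class from Strings.h, and it uses a fixed char buffer so that it builds with any compiler.  The integer constructor exists only while INTEGRAL_CONVERSIONS is defined, so a project that never converts numbers does not pay for that code.

```cpp
#include <cstring>

#define INTEGRAL_CONVERSIONS          // comment this out to drop the integer support below

class MiniString                      // hypothetical stand-in, not the String Class from Strings.h
{
 public:
 MiniString() { szBuf[0]=0; }

 explicit MiniString(const char* p)   // always available
 {
  std::strncpy(szBuf,p,sizeof(szBuf)-1);
  szBuf[sizeof(szBuf)-1]=0;
 }

 #ifdef INTEGRAL_CONVERSIONS
    MiniString(int iNumber)           // compiled only when INTEGRAL_CONVERSIONS is defined
    {
     char szTmp[16];                  // digits are produced in reverse order
     unsigned int u = iNumber<0 ? 0u-(unsigned int)iNumber : (unsigned int)iNumber;
     int i=0;
     do { szTmp[i++]=(char)('0'+u%10); u/=10; } while(u);
     if(iNumber<0)
        szTmp[i++]='-';
     int j=0;
     while(i>0)                       // reverse into the real buffer
        szBuf[j++]=szTmp[--i];
     szBuf[j]=0;
    }
 #endif

 const char* lpStr() const { return szBuf; }

 private:
 char szBuf[32];
};
```

With the define in place, MiniString s=54321 compiles and s.lpStr() yields "54321".  With the define commented out, the same line fails to compile, which is a handy reminder that the project needs that chunk after all.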

     So in the following programs I’ll list program sizes with and without these various capabilities.  Obviously, if a program is using floating point numbers that capability needs to be brought in through the above defines.  So given that explanation, let’s move to Demo5.cpp…

Code: [Select]
// cl Demo5.cpp Strings.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
// 4,608 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 5,120 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 int iParseCount=0;
 String* pStrs=NULL;
 String s1;
 
 s1=_T("Zero, One, Two, Three, Four, Five");
 s1.Print(_T("s1 = "),true);
 iParseCount=s1.ParseCount(_T(','));
 _tprintf(_T("iParseCount = %d\n\n"),iParseCount);
 pStrs=new String[iParseCount];
 s1.Parse(pStrs, _T(','), iParseCount);
 for(int i=0; i<iParseCount; i++)
 {
     pStrs[i].LTrim();
     pStrs[i].Print(true);
 }
 delete [] pStrs;
 getchar();
 
 return 0;
}

// Output:
// ==========================
// s1 = Zero, One, Two, Three, Four, Five
// iParseCount = 6

// Zero
// One
// Two
// Three
// Four
// Five


     It’s just a parsing example using my String::Parse() member, which is similar to PowerBASIC’s Parse statement.  There are quite a few significant things to discuss here.  We’ll start with this strange creature in the command line compilation string….

Code: [Select]
/Zc:sizedDealloc-

The ‘/Zc:sizedDealloc-’ switch is required by VC19 (from Visual Studio 2015), but is not required by earlier versions.  It is essentially a ‘workaround’ for a VC19 bug which Microsoft has not as of this writing fixed, although they’ve acknowledged it.  The switch turns off the ‘sized deallocation’ feature of the C++14 Standard, which VC19 enables by default and under which the compiler may call an operator delete overload that takes the size of the block as a second parameter.  What triggers the need for it in this program are these lines…

Code: [Select]
String* pStrs=NULL;
delete [] pStrs;

…which are somewhat unusual in that none of my String Class members other than String::Parse() take pointers to String Class objects as parameters.  I lost a lot of days on this one before I solved it.  So we’re somewhere around 5 k with that code.  Let’s take a look at the STL based C++ Standard Library weird science version of the above…

Code: [Select]
// cl StdLibParse1.cpp /O1 /Os /MT /EHsc
// 200,192 Bytes
#include <iostream>
#include <sstream>

int main()
{
 std::string input = "Zero, One, Two, Three, Four, Five, Six";
 std::istringstream ss(input);
 std::string token;
 while(std::getline(ss, token, ','))
 {
    std::cout << token << '\n';
 }
 
 return 0;
}


#if 0

Output:
=======
Zero
 One
 Two
 Three
 Four
 Five
 Six

 #endif

That’s coming in at about 200 k, or roughly 195 k bigger than with TCLib.lib.  Moving on, Demo6.cpp shows use of one of the overloaded constructors to construct a new String out of an existing String…

Code: [Select]
// cl Demo6.cpp Strings.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
// 3,584 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 4,096 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 String s1(_T("Hello, World!"));
 String s2(s1);
 
 s1.Print(true);
 s2.Print(true);
 getchar();
 
 return 0;
}

// Output:
// =============
// Hello, World!
// Hello, World!

     String s1 is instantiated/constructed with the text string literal “Hello, World!”.  String s2 is constructed out of String s1.  Let’s play with some big numbers again.  Here is Demo7.cpp where we’ll use some of the overloaded String Constructors to create Strings out of numbers…

Code: [Select]
// cl Demo7.cpp Strings.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
// 5,632 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 String s1(123456789);
 s1.Print(true);
 String s2(-123456789);
 s2.Print(true);
 #ifdef x64
    String s3(12345678987654);
    String s4(-12345678987654);
    s3.Print(true);
    s4.Print(true);
 #endif
 String s5(-3.14159);
 s5.Print(true);
 getchar();
 
 return 0;
}

// Output:
// =============
// 123456789
// -123456789
// 12345678987654
// -12345678987654
// -3.141590


     Note that I had to put String s3 and s4 inside conditional compilation directives because I didn’t create an overloaded constructor for _int64 when compiling in x86 mode.  It’s an addition I should probably make (yet another version of my String Class! – See what I mean?).  Demo8.cpp is more of the same, but we’ll start getting into the operator= functions.  Here we use operator= to assign a TCHAR, a string literal, and another String to String objects…

Code: [Select]
// cl Demo8.cpp Strings.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
// 4,096 Bytes  With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 4,608 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 String s1,s2,s3;
 
 s1=_T('A');
 s2=_T("Hello, World!");
 s3=s2;
 s1.Print(true);
 s2.Print(true);
 s3.Print(true);
 getchar();
 
 return 0;
}

// Output:
// =============
// A
// Hello, World!
// Hello, World!

In Demo9.cpp we’ll play with big numbers again, but use operator= overloads to assign the numbers to String objects…

Code: [Select]
// cl Demo9.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 6,144 Bytes
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 String s1,s2,s3,s4,s5;
 
 s1 =  123456789;
 s1.Print(true);
 s2 = -123456789;
 s2.Print(true);
 #ifdef x64
    s3 =  12345678987654;
    s4 = -12345678987654;
    s3.Print(true);
    s4.Print(true);
 #endif   
 s5 =  3.141590;
 s5.Print(true);
 getchar();
 
 return 0;
}

// Output:
// =============
// 123456789
// -123456789
// 12345678987654
// -12345678987654
// 3.141590

Demo10.cpp is a fun little program where we’ll start getting into operator+ overloads…

Code: [Select]
// cl Demo10.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 4,096 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 4,608 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 String s1;
 
 for(size_t i=65; i<=90; i++)
 {
     s1=s1+i;
     s1.Print(true);
 }
 getchar();
 
 return 0;
}

// Output:
// =============
// A
// AB
// ABC
// ABCD
// ABCDE
// ABCDEF
// ABCDEFG
// ABCDEFGH
// ABCDEFGHI
// ABCDEFGHIJ
// ABCDEFGHIJK
// ABCDEFGHIJKL
// ABCDEFGHIJKLM
// ABCDEFGHIJKLMN
// ABCDEFGHIJKLMNO
// ABCDEFGHIJKLMNOP
// ABCDEFGHIJKLMNOPQ
// ABCDEFGHIJKLMNOPQR
// ABCDEFGHIJKLMNOPQRS
// ABCDEFGHIJKLMNOPQRST
// ABCDEFGHIJKLMNOPQRSTU
// ABCDEFGHIJKLMNOPQRSTUV
// ABCDEFGHIJKLMNOPQRSTUVW
// ABCDEFGHIJKLMNOPQRSTUVWX
// ABCDEFGHIJKLMNOPQRSTUVWXY
// ABCDEFGHIJKLMNOPQRSTUVWXYZ

Demo11.cpp is kind of fun too, as we’ll disassemble a String with Parse, and then put it back together again from its disassembled parts with operator+ calls on String objects.  And we’ll be able to make use of good ol’ Right$ ( String::Right() ) in the process…

Code: [Select]
// cl Demo11.cpp Strings.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
// 6,144 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 6,656 Bytes With Full String Class
#define  UNICODE
#define  _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"

int main()
{
 String* pStrs=NULL;
 int iParseCount=0;
 String s1,s2;
 
 s1=_T("Zero,  One, Two  , Three ,  Four, Five   , Six, Seven, Eight , Nine, Ten");
 s1.Print(_T("s1 = "), true);
 iParseCount = s1.ParseCount(_T(','));
 _tprintf(_T("iParseCount = %d\n\n"),iParseCount);
 pStrs = new String[iParseCount];
 s1.Parse(pStrs, _T(','), iParseCount);
 for(int i=0; i<iParseCount; i++)
 {
     pStrs[i].Trim();
     pStrs[i].Print(true);
     s2 = s2 + _T(',') + pStrs[i];
 }
 s2.Print(_T("\ns2 = "),true);
 s2=s2.Right(s2.Len()-1);
 s2.Print(_T("s2 = "),true);
 delete [] pStrs;
 getchar();
 
 return 0;
}

// Output:
// =============
// s1 = Zero,  One, Two  , Three ,  Four, Five   , Six, Seven, Eight , Nine, Ten
// iParseCount = 11
//
// Zero
// One
// Two
// Three
// Four
// Five
// Six
// Seven
// Eight
// Nine
// Ten
//
// s2 = ,Zero,One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Ten
// s2 = Zero,One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Ten
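
The ParseCount / Parse / Trim sequence used in Demo11 can be sketched on plain character buffers.  This is not the code from Strings.cpp; parseCountSketch and parseFieldSketch are hypothetical helpers written only to show the idea, which is that the field count is the delimiter count plus one, and each field is the run of characters between two delimiters with the blanks shaved off both ends.

```cpp
#include <cstddef>
#include <cstring>

// Number of fields = number of delimiters + 1 (hypothetical helper)
std::size_t parseCountSketch(const char* p, char cDelimiter)
{
 std::size_t n=1;
 for(; *p; p++)
     if(*p==cDelimiter)
        n++;
 return n;
}

// Copies zero based field iField into out with leading/trailing blanks trimmed.
// Returns false if the field does not exist or does not fit in cchOut elements.
bool parseFieldSketch(const char* p, char cDelimiter, std::size_t iField, char* out, std::size_t cchOut)
{
 for(std::size_t i=0; i<iField; i++)          // step over iField delimiters
 {
     p=std::strchr(p,cDelimiter);
     if(!p)
        return false;
     p++;
 }
 const char* pEnd=std::strchr(p,cDelimiter);  // field ends at next delimiter...
 if(!pEnd)
    pEnd=p+std::strlen(p);                    // ...or at the end of the string
 while(p<pEnd && *p==' ')                     // trim left
    p++;
 while(pEnd>p && pEnd[-1]==' ')               // trim right
    pEnd--;
 std::size_t n=(std::size_t)(pEnd-p);
 if(n+1>cchOut)
    return false;
 std::memcpy(out,p,n);
 out[n]=0;
 return true;
}
```

Called on the s1 text from Demo11, parseCountSketch reports 11 fields and parseFieldSketch with iField set to 5 fills the buffer with "Five".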

Demo12.cpp makes use of good old InStr() in C++ String Class form….

Code: [Select]
// cl Demo12.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 5,120 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 5,632 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 int iFound=0;
 String s1;
 
 s1=_T("C++ Can Be A Real Pain In The Butt To Learn!");
 s1.Print(_T("s1 = "),true);
 iFound=s1.InStr(_T("real"),true,true);
 if(!iFound)
    _tprintf(_T("Couldn't Locate 'real' In s1 With A Case Sensitive Search!\n"));
 iFound=s1.InStr(_T("real"),false,true); 
 _tprintf(_T("But A Case Insensitive Search Located 'real' At One Based Offset %d In s1!\n"),iFound);
 getchar();
 
 return 0;
}

// Output:
// ==========================================================================
// s1 = C++ Can Be A Real Pain In The Butt To Learn!
// Couldn't Locate 'real' In s1 With A Case Sensitive Search!
// But A Case Insensitive Search Located 'real' At One Based Offset 14 In s1!
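
For readers who have not met the BASIC convention, the point of InStr is that the result is a one based position, so zero is free to mean ‘not found’.  A minimal sketch of that convention follows.  The name inStrSketch and its two parameter form are assumptions made for illustration (the real String::InStr() above takes a third argument), and the case folding only handles plain ASCII letters.

```cpp
#include <cstddef>

static char lowerAscii(char c)    // ASCII only; a real version would need locale aware folding
{
 return (c>='A' && c<='Z') ? (char)(c-'A'+'a') : c;
}

// One based position of pNeedle within pHay, or 0 if it is not there
int inStrSketch(const char* pHay, const char* pNeedle, bool blnCaseSensitive)
{
 if(!*pNeedle)
    return 0;
 for(std::size_t i=0; pHay[i]; i++)
 {
     std::size_t j=0;
     while(pNeedle[j] && pHay[i+j])
     {
        char a=pHay[i+j], b=pNeedle[j];
        if(!blnCaseSensitive)
        {
           a=lowerAscii(a);
           b=lowerAscii(b);
        }
        if(a!=b)
           break;
        j++;
     }
     if(!pNeedle[j])              // walked off the end of the needle, so every character matched
        return (int)i+1;
 }
 return 0;
}
```

Because zero means failure, the if(!iFound) test in Demo12 reads naturally, and a hit at the very first character comes back as 1 rather than 0.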

Demo13.cpp makes use of one of the two overloads of String::Remove() where the vowels “aeiou” are individually removed from the String containing the English alphabet….

Code: [Select]
// cl Demo13.cpp Strings.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
// 5,120 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 5,120 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 String s1=_T("abcdefghijklmnopqrstuvwxyz");
 String s2=_T("aeiou");
 String s3=s1.Remove(s2.lpStr());
 s1.Print(true);
 s3.Print(true);
 getchar();
 
 return 0;
}

// Output:
// ==========================
// abcdefghijklmnopqrstuvwxyz
// bcdfghjklmnpqrstvwxyz
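
The character set form of Remove used in Demo13 boils down to one pass over the source, keeping whatever is not in the set.  The helper below is a hypothetical sketch on char buffers, not the String::Remove() member itself, and it assumes the caller supplies a destination at least as large as the source.

```cpp
#include <cstring>

// Copies pSrc to pDest, skipping every character that appears in pCharsToRemove.
// pDest must have room for strlen(pSrc)+1 characters.
void removeCharsSketch(const char* pSrc, const char* pCharsToRemove, char* pDest)
{
 for(; *pSrc; pSrc++)
 {
     if(!std::strchr(pCharsToRemove,*pSrc))   // keep it only if it is not in the removal set
        *pDest++=*pSrc;
 }
 *pDest=0;
}
```

Fed the alphabet and "aeiou" it leaves the 21 consonants, matching the second line of the Demo13 output.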

Demo14.cpp uses the other String::Remove() overload to remove the sub-string “Real” from a larger String…

Code: [Select]
// cl Demo14.cpp Strings.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
// 5,120 Bytes With INTEGRAL_CONVERSIONS, FLOATING_POINT_CONVERSIONS, And FORMATTING Remmed Out
// 5,632 Bytes With Full String Class
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int main()
{
 String s1=_T("C++ Can Be A Real Pain In The Butt To Learn!");
 s1.Print(true);
 String s2=s1.Remove(_T("Real"), true);
 s2.Print(true);
 getchar();
 
 return 0;
}

// Output:
// ============================================
// C++ Can Be A Real Pain In The Butt To Learn!
// C++ Can Be A  Pain In The Butt To Learn!

We’re back to big numbers again with Demo15.cpp, where we’ll make first use of one of my String::Format() members, which converts a large double precision floating point number to a String with comma separators for thousands places and a dot for the decimal separator…

Code: [Select]
// cl Demo15.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 9,216 Bytes With Full String Class, Which Is Mostly Needed
#define  UNICODE
#define  _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int _tmain()
{
 double dblNum=-1234567898786.7898765;
 String s1;
 
 s1.Format(dblNum, 24, 3, _T(','), _T('.'), true);
 s1.Print(_T("s1="),true);
 getchar();
 
 return 0;
}

// Output:
// ==========================
// s1= -1,234,567,898,786.790
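
The two jobs Format does besides the number conversion itself, inserting a separator every three digits and right justifying in a field, can be shown with integers alone.  The function below is a hypothetical sketch and not the String::Format() code.  It follows the same convention as the listing in Strings.cpp in that the size argument counts the terminating NULL, so the padding is size minus length minus one.

```cpp
#include <cstddef>
#include <cstring>

// Writes iNumber into pOut (cchOut elements including the NULL), with cThousands
// between groups of three digits, right justified.  Returns false if it will not fit.
bool formatIntSketch(long long iNumber, char* pOut, std::size_t cchOut, char cThousands)
{
 char szRev[32];                              // text is built back to front
 std::size_t n=0, iDigits=0;
 unsigned long long u = iNumber<0 ? 0ull-(unsigned long long)iNumber : (unsigned long long)iNumber;
 do
 {
    if(iDigits && iDigits%3==0)               // a separator goes before every 4th, 7th, ... digit
       szRev[n++]=cThousands;
    szRev[n++]=(char)('0'+u%10);
    u/=10;
    iDigits++;
 } while(u);
 if(iNumber<0)
    szRev[n++]='-';
 if(n+1>cchOut)
    return false;
 std::size_t iPad=cchOut-1-n;                 // blanks needed on the left
 std::memset(pOut,' ',iPad);
 std::size_t k=iPad;
 while(n)                                     // reverse into place
    pOut[k++]=szRev[--n];
 pOut[k]=0;
 return true;
}
```

With a 12 element buffer, -1234567 and a comma, the ten characters of "-1,234,567" are preceded by one blank, the same single blank of padding visible in the Demo15 output.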

My last example, Demo16.cpp, uses my String::Format() member in a nested For Loop to iterate through an array of doubles and format the numbers with zero, one or two decimal places to the right of the decimal point.  I added a number to the array to test the rounding when the rounding digit is a five.  So the last two numbers in the array, that is, 1.985 and 1.9754321, both round to 1.98 when two digits are specified to the right of the decimal point.  I thought this was how it was supposed to be done.  But I’m not an expert on this matter.  If someone tells me differently I can easily change the code…

Code: [Select]
// cl Demo16.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 9,728 Bytes With Full String Class, Which Is Mostly Needed
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int _tmain()
{
 double dblNums[]={123456.78912, -456.9876, 9999.99999, -0.987654, 0.0, 1.985, 1.9754321};
 String s1;
 
 for(size_t i=0; i<=2; i++)
 {
     for(size_t j=0; j<sizeof(dblNums)/sizeof(dblNums[0]); j++)
     {
         s1.Format(dblNums[j], 12, i, _T(','), _T('.'), true);
         s1.Print(true);
     }
     printf("\n\n");
 }
 getchar();
 
 return 0;
}

//     Output
// ==========
//    123,457
//       -457
//     10,000
//         -1
//          0
//          2
//          2
//
//
//  123,456.8
//     -457.0
//   10,000.0
//       -1.0
//        0.0
//        2.0
//        2.0
//
//
// 123,456.79
//    -456.99
//  10,000.00
//      -0.99
//       0.00
//       1.98
//       1.98
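
On the question of how a trailing five should round, there are two common conventions and they disagree on 1.985.  The sketch below works on whole thousandths (1985 stands for 1.985) because a value such as 1.985 generally cannot be stored exactly in a double, so a floating point test of an ‘exact half’ depends on which side of the half the stored value falls.  Both function names are hypothetical, and neither is claimed to be what String::Format() does internally.

```cpp
// Thousandths -> hundredths, halves rounded away from zero (the schoolbook rule)
long long roundHalfAwaySketch(long long iThousandths)
{
 long long q=iThousandths/10, r=iThousandths%10;   // integer division truncates toward zero
 if(r>=5)
    q++;
 else if(r<=-5)
    q--;
 return q;
}

// Thousandths -> hundredths, halves rounded to the nearest even digit (banker's rounding)
long long roundHalfEvenSketch(long long iThousandths)
{
 long long q=iThousandths/10, r=iThousandths%10;
 long long a = r<0 ? -r : r;                       // size of the dropped digit
 if(a>5 || (a==5 && (q%2)!=0))                     // past the half, or exactly half with an odd q
    q += (iThousandths<0) ? -1 : 1;
 return q;
}
```

Under the first rule 1985 becomes 199 (1.99), under the second it becomes 198 (1.98), while 1975 becomes 198 under both.  So the 1.98 shown for 1.985 in the Demo16 output is what round half to even would give, and also what any rule would give if the stored double sits a hair below 1.985.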

     Well, that’s all the little demos.  It’s time to post all the source code necessary to build TCLib.lib…

continued...
« Last Edit: March 23, 2016, 01:49:49 AM by Frederick J. Harris »

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #4 on: March 23, 2016, 01:27:16 AM »
Code: [Select]
//========================================================================================
//                 Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                              By Fred Harris, January 2016
//
// cl crt_con_a.cpp /D "_CRT_SECURE_NO_WARNINGS" /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//========================================================================================
#include <windows.h>
#pragma comment(linker, "/defaultlib:kernel32.lib")
#pragma comment(linker, "/nodefaultlib:libc.lib")
#pragma comment(linker, "/nodefaultlib:libcmt.lib")
extern "C" int __cdecl  main();

extern "C" void __cdecl mainCRTStartup()
{
 int iReturn = main();
 ExitProcess(iReturn);
}


Code: [Select]
//========================================================================================
//                 Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                              By Fred Harris, January 2016
//
// cl crt_con_w.cpp /D "_CRT_SECURE_NO_WARNINGS" /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//========================================================================================
#include <windows.h>
#pragma comment(linker, "/defaultlib:kernel32.lib")
#pragma comment(linker, "/nodefaultlib:libc.lib")
#pragma comment(linker, "/nodefaultlib:libcmt.lib")
extern "C" int __cdecl  wmain();

extern "C" void __cdecl wmainCRTStartup()
{
 int iReturn = wmain();
 ExitProcess(iReturn);
}


Code: [Select]
//========================================================================================
//                 Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                              By Fred Harris, January 2016
//
// cl crt_win_a.cpp /D "_CRT_SECURE_NO_WARNINGS" /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//========================================================================================
#include <windows.h>
#pragma comment(linker, "/defaultlib:kernel32.lib")
#pragma comment(linker, "/nodefaultlib:libc.lib")
#pragma comment(linker, "/nodefaultlib:libcmt.lib")
int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int);

extern "C" void __cdecl WinMainCRTStartup(void)
{
 int iReturn = WinMain(GetModuleHandle(NULL),NULL,NULL,SW_SHOWDEFAULT);
 ExitProcess(iReturn);
}

Code: [Select]
//========================================================================================
//                 Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                              By Fred Harris, January 2016
//
// cl crt_win_w.cpp /D "_CRT_SECURE_NO_WARNINGS" /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//========================================================================================
#include <windows.h>
#pragma comment(linker, "/defaultlib:kernel32.lib")
#pragma comment(linker, "/nodefaultlib:libc.lib")
#pragma comment(linker, "/nodefaultlib:libcmt.lib")
int WINAPI wWinMain(HINSTANCE, HINSTANCE, LPWSTR, int);

extern "C" void __cdecl wWinMainCRTStartup(void)
{
 int iReturn = wWinMain(GetModuleHandle(NULL),NULL,NULL,SW_SHOWDEFAULT);
 ExitProcess(iReturn);
}


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//                     cl memset.cpp /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include "memory.h"

void* __cdecl memset(void* p, int c, size_t count)
{
 char* pCh=(char*)p;
 for(size_t i=0; i<count; i++)
     pCh[i]=(char)c;
 return p;
}

wchar_t* __cdecl wmemset(wchar_t* p, wchar_t c, size_t count)
{
 for(size_t i=0; i<count; i++)
     p[i]=c;
 return p;
}


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//
//                          LIBCTINY -- Matt Pietrek 2001
//                           MSDN Magazine, January 2001
//
//                              With Help From Mike_V
//                       
//                           By Fred Harris, January 2016
//
//                    cl newdel.cpp /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include <windows.h>

void* __cdecl operator new(size_t s)
{
 return HeapAlloc(GetProcessHeap(), 0, s);
}

void  __cdecl operator delete(void* p)
{
 HeapFree(GetProcessHeap(), 0, p);
}

void* operator new [] (size_t s)
{
 return HeapAlloc(GetProcessHeap(), 0, s);
}

void operator delete [] (void* p)
{
 HeapFree(GetProcessHeap(), 0, p);
}
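
The four operators above are the unsized forms.  With C++14 sized deallocation switched on, a compiler may instead call the two argument forms of operator delete, which is why the demos that delete an array of String objects carry the /Zc:sizedDealloc- switch.  One alternative is to supply the sized forms as well and forward them to the unsized ones.  The sketch below shows that pattern in portable form: it uses malloc/free where newdel.cpp uses HeapAlloc/HeapFree, the Item class and the counter are hypothetical, and whether this removes the need for the switch under VC19 is an assumption that has not been tested here.

```cpp
#include <cstddef>
#include <cstdlib>

static int g_iFrees=0;                          // counts blocks released, for demonstration only

void* operator new(std::size_t s)
{
 void* p=std::malloc(s ? s : 1);
 if(!p)
    std::abort();                               // sketch only: no exceptions, just stop
 return p;
}

void* operator new[](std::size_t s) { return operator new(s); }

void operator delete(void* p) noexcept
{
 if(p)
 {
    g_iFrees++;
    std::free(p);
 }
}

void operator delete[](void* p) noexcept { operator delete(p); }

// C++14 sized forms: ignore the size and forward to the unsized versions
void operator delete(void* p, std::size_t) noexcept   { operator delete(p); }
void operator delete[](void* p, std::size_t) noexcept { operator delete(p); }

struct Item                                     // hypothetical class with a destructor
{
 Item() : i(0) {}
 ~Item() { i=-1; }
 int i;
};

static Item* volatile g_pSink=0;                // keeps the optimizer from removing the allocation

int freesForArrayRoundTrip()
{
 int iBefore=g_iFrees;
 g_pSink=new Item[4];
 delete [] g_pSink;                             // may arrive through either delete[] form
 return g_iFrees-iBefore;
}
```

Whichever form the compiler picks, the block is released exactly once, so freesForArrayRoundTrip() returns 1.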


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//                    cl printf.cpp /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include <windows.h>
#include <stdarg.h>
#include "stdio.h"
#pragma comment(linker, "/defaultlib:user32.lib") 


int __cdecl printf(const char* format, ...)
{
 char szBuff[1024];
 DWORD cbWritten;
 va_list argptr;
 int retValue;
         
 va_start(argptr, format);
 retValue = wvsprintfA(szBuff, format, argptr);   // Explicit A version; format is const char* even if UNICODE is defined
 va_end(argptr);
 WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), szBuff, retValue, &cbWritten, 0);

 return retValue;
}


int __cdecl wprintf(const wchar_t* format, ...)
{
 wchar_t szBuffW[1024];
 char szBuffA[1024];
 int iChars,iBytes;
 DWORD cbWritten;
 va_list argptr;
           
 va_start(argptr, format);
 iChars = wvsprintfW(szBuffW, format, argptr);
 va_end(argptr);
 iBytes=WideCharToMultiByte(CP_ACP,0,szBuffW,iChars,szBuffA,1024,NULL,NULL);
 WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), szBuffA, iBytes, &cbWritten, 0);

 return iChars;
}

Code: [Select]
//=============================================================
//   Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                By Fred Harris, February 2016
//
// cl sprintf.cpp /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN   
//=============================================================
#include <windows.h>
#include <stdarg.h>
#include "stdio.h"
#define EOF (-1)
#pragma comment(linker, "/defaultlib:user32.lib")

int __cdecl sprintf(char* buffer, const char* format, ...)
{
 va_list argptr;
 int retValue;
           
 va_start(argptr, format);
 retValue = wvsprintfA(buffer, format, argptr);
 va_end(argptr);

 return retValue;
}

int __cdecl swprintf(wchar_t* buffer, const wchar_t* format, ...)
{
 va_list argptr;
 int retValue;
         
 va_start(argptr, format);
 retValue = wvsprintfW(buffer, format, argptr);
 va_end(argptr);

 return retValue;
}


Code: [Select]
//==============================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//     cl _strnicmp.cpp /D "_CRT_SECURE_NO_WARNINGS" /c /W3 /DWIN32_LEAN_AND_MEAN
//==============================================================================================
#include <windows.h>
#include "malloc.h"
#include "string.h"

int __cdecl _strnicmp(const char* str1, const char* str2, size_t count)
{
 size_t iLen1=strlen(str1);
 size_t iLen2=strlen(str2);
 if(count>iLen1)
    return -1;
 if(count>iLen2)
    return 1;
 char* pStr1=(char*)malloc(count+1);
 char* pStr2=(char*)malloc(count+1);
 strncpy(pStr1,str1,count);
 strncpy(pStr2,str2,count);
 pStr1[count]=0;
 pStr2[count]=0;
 int iReturn=lstrcmpiA(pStr1,pStr2);
 free(pStr1);
 free(pStr2);

 return iReturn;
}

int __cdecl _wcsnicmp(const wchar_t* str1, const wchar_t* str2, size_t count)
{
 size_t iLen1=wcslen(str1);
 size_t iLen2=wcslen(str2);
 if(count>iLen1)
    return -1;
 if(count>iLen2)
    return 1;
 wchar_t* pStr1=(wchar_t*)malloc(count*2+2);
 wchar_t* pStr2=(wchar_t*)malloc(count*2+2);
 wcsncpy(pStr1,str1,count);
 wcsncpy(pStr2,str2,count);
 pStr1[count]=0;
 pStr2[count]=0;
 int iReturn=lstrcmpiW(pStr1,pStr2);
 free(pStr1);
 free(pStr2);

 return iReturn;
}

Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//                 cl strncpy.cpp /O1 /Os /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include <windows.h>
#include "string.h"


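char* __cdecl strncpy(char* strDest, const char* strSource, size_t iCount)   
{                                              // 3rd param is size_t
 if(iCount==0)                                 // Nothing to copy; also keeps iCount-1 below from wrapping around
    return strDest;
 size_t iLenSrc=strlen(strSource);             // strlen returns size_t                                           
 lstrcpynA(strDest,strSource,(int)iCount);     // lstrcpyn wants an int here for 3rd param, so try cast
 if(iCount>iLenSrc)
 {
    for(size_t i=iLenSrc; i<iCount; i++)       // Pad the remainder of strDest with zeros
        strDest[i]=0;
 }
 else
    strDest[iCount-1]=strSource[iCount-1];     // lstrcpyn put a NULL here; restore the source character

 return strDest;
}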


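wchar_t* __cdecl wcsncpy(wchar_t* strDest, const wchar_t* strSource, size_t iCount)   
{                                           // 3rd param is size_t
 if(iCount==0)                              // Nothing to copy; also keeps iCount-1 below from wrapping around
    return strDest;
 size_t iLen=wcslen(strSource);             // wcslen returns size_t                                           
 lstrcpynW(strDest,strSource,(int)iCount);  // lstrcpyn wants an int here for 3rd param, so try cast
 if(iCount>iLen)
 {
    for(size_t i=iLen; i<iCount; i++)       // Pad the remainder of strDest with zeros
        strDest[i]=0;
 }
 else
    strDest[iCount-1]=strSource[iCount-1];  // lstrcpyn put a NULL here; restore the source character

 return strDest;
}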

Code: [Select]
//===============================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//       cl strncmp.cpp /D "_CRT_SECURE_NO_WARNINGS" /c /W3 /DWIN32_LEAN_AND_MEAN
//===============================================================================================
#include <windows.h>
#include "malloc.h"
#include "string.h"


int __cdecl strncmp(const char* str1, const char* str2, size_t count)
{
 size_t iLen1=strlen(str1);
 size_t iLen2=strlen(str2);
 if(count>iLen1)
    return -1;
 if(count>iLen2)
    return 1;
 char* pStr1=(char*)malloc(count+1);
 char* pStr2=(char*)malloc(count+1);
 strncpy(pStr1,str1,count);
 strncpy(pStr2,str2,count);
 pStr1[count]=0;
 pStr2[count]=0;
 int iReturn=strcmp(pStr1,pStr2);
 free(pStr1);
 free(pStr2);

 return iReturn;
}


int __cdecl wcsncmp(const wchar_t* str1, const wchar_t* str2, size_t count)
{
 size_t iLen1=wcslen(str1);
 size_t iLen2=wcslen(str2);
 if(count>iLen1)
    return -1;
 if(count>iLen2)
    return 1;
 wchar_t* pStr1=(wchar_t*)malloc(count*2+2);
 wchar_t* pStr2=(wchar_t*)malloc(count*2+2);
 wcsncpy(pStr1,str1,count);
 wcsncpy(pStr2,str2,count);
 pStr1[count]=0;
 pStr2[count]=0;
 int iReturn=wcscmp(pStr1,pStr2);
 free(pStr1);
 free(pStr2);

 return iReturn;
}
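
One thing to be aware of with the strncmp and wcsncmp listings above (and the _strnicmp pair earlier) is that they return -1 or 1 as soon as count is larger than the length of either string.  The Standard Library versions instead stop at the terminating NULL, so for example strncmp("ab","ab",5) is 0.  A loop based sketch that follows the standard behavior, with no malloc and no temporary copies, might look like the following.  The name strncmpSketch is hypothetical, and it compares raw character values rather than going through lstrcmp, which follows the user’s locale.

```cpp
#include <cstddef>

// Compares at most count characters; stops early at a difference or at the NULL
int strncmpSketch(const char* str1, const char* str2, std::size_t count)
{
 for(std::size_t i=0; i<count; i++)
 {
     unsigned char c1=(unsigned char)str1[i];
     unsigned char c2=(unsigned char)str2[i];
     if(c1!=c2)
        return c1<c2 ? -1 : 1;
     if(c1==0)                      // both strings ended together
        break;
 }
 return 0;
}
```

A wide character version would be the same loop over wchar_t, and a case insensitive version would fold both characters before comparing them.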

Code: [Select]
//========================================================
// Developed As An Addition To Matt Pietrek's LibCTiny.lib
//              By Fred Harris, January 2016
//
//     cl _strrev.cpp /Oi /c /W3 /DWIN32_LEAN_AND_MEAN
//========================================================
#include <windows.h>
#include "string.h"

char* __cdecl _strrev(char* pStr)
{
 size_t iLen,iHalf;
 char a,b;

 iLen=strlen(pStr), iHalf=iLen/2;
 for(size_t i=0; i<iHalf; i++)
 {
     a=pStr[i], b=pStr[iLen-i-1];
     pStr[i]=b, pStr[iLen-i-1]=a;
 }

 return pStr;
}

wchar_t* __cdecl _wcsrev(wchar_t* pStr)
{
 size_t iLen,iHalf;
 wchar_t a,b;

 iLen=wcslen(pStr), iHalf=iLen/2;
 for(size_t i=0; i<iHalf; i++)
 {
     a=pStr[i], b=pStr[iLen-i-1];
     pStr[i]=b, pStr[iLen-i-1]=a;
 }

 return pStr;
}


Code: [Select]
//===================================================================
//   Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                By Fred Harris, January 2016
//
//         cl strcat.cpp /c /W3 /DWIN32_LEAN_AND_MEAN
//===================================================================
#include <windows.h>
#include "string.h"

char* __cdecl strcat(char* strDest, const char* strSource)   
{                                           // Thin wrapper over Win32 lstrcatA
 return lstrcatA(strDest, strSource);
}

wchar_t* __cdecl wcscat(wchar_t* strDest, const wchar_t* strSource)   
{                                           // Thin wrapper over Win32 lstrcatW
 return lstrcatW(strDest, strSource);
}


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//                      cl strcmp.cpp /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include <windows.h>
#include "string.h"

int __cdecl strcmp(const char* string1, const char* string2)   
{
 return lstrcmpA(string1,string2);
}

int __cdecl wcscmp(const wchar_t* string1, const wchar_t* string2)   
{
 return lstrcmpW(string1,string2);
}
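
One thing to keep in mind with the wrappers above: the Win32 lstrcmp functions compare using the user's locale rules (a "word sort"), so the ordering can differ from the plain character code ordering that the C Runtime strcmp gives.  If ordinal ordering matters, a loop along the following lines could be used instead.  This is only a sketch, and the name OrdinalCompare is made up for illustration; it is not part of TCLib.

```cpp
#include <cassert>

// Hypothetical helper, not part of TCLib: compares two null terminated
// strings by unsigned character code, the way the C standard describes strcmp.
int OrdinalCompare(const char* s1, const char* s2)
{
    assert(s1 && s2);                       // both pointers must be valid

    // Walk both strings while the characters match and s1 has not ended.
    while (*s1 && *s1 == *s2)
    {
        ++s1;
        ++s2;
    }

    // Compare the first differing pair (or the terminators) as unsigned.
    unsigned char c1 = (unsigned char)*s1;
    unsigned char c2 = (unsigned char)*s2;
    return (c1 > c2) - (c1 < c2);           // yields -1, 0 or 1
}
```

With this version "B" sorts before "a" because 66 is less than 97, whereas a locale based word sort would typically put "a" first.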


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//                      cl strcpy.cpp /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include <windows.h>
#include <string.h>

char* __cdecl strcpy(char* strDestination, const char* strSource)
{
 return lstrcpyA(strDestination, strSource);
}

wchar_t* __cdecl wcscpy(wchar_t* strDestination, const wchar_t* strSource)
{
 return lstrcpyW(strDestination, strSource);
}


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//                      cl strlen.cpp /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include <windows.h>
#include "string.h"

size_t __cdecl strlen(const char* strSource)
{
 return lstrlenA(strSource);
}

size_t __cdecl wcslen(const wchar_t* strSource)
{
 return lstrlenW(strSource);
}


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//              cl getchar.cpp /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
#include <windows.h>
#include "stdio.h"

char __cdecl getchar()
{
 DWORD nRead,dwConsoleMode;
 INPUT_RECORD ir[8];
 bool blnLoop=true;
 HANDLE hStdIn;
 char c=0;

 hStdIn=GetStdHandle(STD_INPUT_HANDLE);
 GetConsoleMode(hStdIn,&dwConsoleMode);
 SetConsoleMode(hStdIn,0);
 FlushConsoleInputBuffer(hStdIn);
 do
 {
    WaitForSingleObject(hStdIn,INFINITE);
    ReadConsoleInput(hStdIn,ir,8,&nRead);
    for(unsigned i=0;i<nRead;i++)
    {
        if(ir[i].EventType==KEY_EVENT && ir[i].Event.KeyEvent.bKeyDown==TRUE)
        {
           c=ir[i].Event.KeyEvent.uChar.AsciiChar;
           blnLoop=false;
        }
    }
 }while(blnLoop==true);
 SetConsoleMode(hStdIn,dwConsoleMode);

 return c;
}

Code: [Select]
// cl alloc.cpp /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
#include <windows.h>
#include "malloc.h"

void* __cdecl malloc(size_t size)
{
 return HeapAlloc(GetProcessHeap(), 0, size);
}

void __cdecl free(void* pMem)
{
 HeapFree(GetProcessHeap(), 0, pMem);
}

Code: [Select]
// cl alloc2.cpp /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
#include <windows.h>
#include "malloc.h"

void* __cdecl realloc(void* pMem, size_t size)
{
 if(pMem)
    return HeapReAlloc(GetProcessHeap(), 0, pMem, size);
 else
    return HeapAlloc(GetProcessHeap(), 0, size);
}

void* __cdecl calloc(size_t nitems, size_t size)
{
 if(size && nitems>((size_t)-1)/size)        // nitems * size would overflow
    return NULL;
 return HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, nitems * size);
}

Code: [Select]
// cl allocsup.cpp /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
#include <windows.h>
#include "malloc.h"

void* __cdecl _nh_malloc(size_t size, int nhFlag)
{
 return HeapAlloc(GetProcessHeap(), 0, size);
}

size_t __cdecl _msize(void* pMem)
{
 return HeapSize(GetProcessHeap(), 0, pMem);
}

Next one’s quite large!

Code: [Select]
//==============================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                             By Fred Harris, March 2016
//
// FltToCh and FltToWch take as input char or wchar_t pointer p, which points to a buffer with
// adequate size to contain the resulting converted character string.
//
//   cl FltToCh.cpp /D "_CRT_SECURE_NO_WARNINGS" /O1 /Os /GS- /c /W3 /DWIN32_LEAN_AND_MEAN
//==============================================================================================
#include <windows.h>
#ifdef _M_IX86
   #include <emmintrin.h>
#endif
#include "malloc.h"
#include "string.h"
#include "stdlib.h"
#include "memory.h"
#include "stdio.h"


#ifdef _M_IX86
   unsigned int DoubleToU32(double x)
   {
    return (unsigned int)_mm_cvttsd_si32(_mm_set_sd(x));
   }
#endif


size_t __cdecl FltToCh(char* p, double x, size_t iFldWthTChrs, int iDecPlaces, char cDecimalSeperator, bool blnRightJustified)
{
 bool blnPositive,blnAbsValLessThanOne;
 bool blnRoundUpSuccessful=false;
 bool blnNeedToRoundUp=false;
 int iRoundingDigitLocation;
 size_t n,i=0,k=0;
 char* p1;
 
 p1=(char*)malloc(32);
 if(!p1)
    return 0;
 memset(p1,0,32);
 p1[0]=32, p1[1]=32;
 p1=p1+2;
 if(x>=0.0L)
    blnPositive=true;
 else
 {
    blnPositive=false;
    x=x*-1.0L;
 }
 if(x<1.0L)
    blnAbsValLessThanOne=true;
 else
    blnAbsValLessThanOne=false;
 #ifdef _M_IX86 
    n=DoubleToU32(x);
 #else
    n=(size_t)x;     
 #endif
 while(n>0)
 {
   x=x/10;
   #ifdef _M_IX86
      n=DoubleToU32(x);
   #else   
      n=(size_t)x;
   #endif
   i++;
 }
 *(p1+i) = cDecimalSeperator;
 x=x*10;
 #ifdef _M_IX86
    n=DoubleToU32(x);
 #else   
    n = (size_t)x;
 #endif 
 x = x-n;
 while(k<=17)
 {
   if(k == i)
      k++;
   *(p1+k)=48+(char)n;
   x=x*10;
   #ifdef _M_IX86
      n=DoubleToU32(x);
   #else   
      n = (size_t)x;
   #endif   
   x = x-n;
   k++;
 }
 *(p1+k) = '\0';
 iRoundingDigitLocation=(int)i+iDecPlaces+1;
 if(p1[iRoundingDigitLocation]>53)
    blnNeedToRoundUp=true;
 else
 {
    if(p1[iRoundingDigitLocation]==53  && p1[iRoundingDigitLocation-1] % 2)
       blnNeedToRoundUp=true;
 }
 p1[iRoundingDigitLocation]=0;
 if(blnNeedToRoundUp)
 {
    int iStart=iRoundingDigitLocation-1;
    for(int h=iStart; h>=0; h--)
    {
        if(h==i)
           continue;
        else
        {
           if(p1[h]!=57 && p1[h]!=cDecimalSeperator)
           {
              p1[h]=p1[h]+1;
              blnRoundUpSuccessful=true;
              break;
           }
           else
              p1[h]=48;
        }
    }
 }
 if(blnPositive)
 {
    if(blnRoundUpSuccessful==false && blnNeedToRoundUp==true)
    {
       p1[-1]=49;
       if(blnAbsValLessThanOne)
          blnAbsValLessThanOne=false;
    }   
    if(iDecPlaces==0)
       p1[i]='\0';     
    if(blnAbsValLessThanOne)
       p1[-1]=48;
 }
 else
 {
    if(blnRoundUpSuccessful==false && blnNeedToRoundUp==true)
    {
       p1[-1]=49;
       p1[-2]='-';
       if(blnAbsValLessThanOne)
          blnAbsValLessThanOne=false;
    }   
    if(iDecPlaces==0)
       p1[i]=0;     
    if(blnAbsValLessThanOne)
    {
       p1[-1]=48;
       p1[-2]='-';
    }
    else
    {
       if(p1[-1]==32)     
          p1[-1]='-';
    }       
 }
 
 int iSpaces=0;
 p1=p1-2;
 for(int i=0; i<18; i++)
 {
     if(p1[i]==32)
        iSpaces++;
 }
 for(int i=0; i<18; i++)
     p1[i]=p1[i+iSpaces];
 size_t iLen=strlen(p1);
 if(iLen>(iFldWthTChrs-1))
 {
    memset(p,'F',iFldWthTChrs-1);
    p[iFldWthTChrs-1]=0;
    free(p1);
    return 0;
 }   
 char* pField=(char*)malloc(iFldWthTChrs);
 if(!pField)
 {
    free(p1);
    return 0;
 }
 size_t iDiff = iFldWthTChrs - iLen -1;
 if(blnRightJustified)
 {
    for(size_t i=0; i<iDiff; i++)
        pField[i]=32;
    pField[iDiff]=0; 
    strcat(pField,p1);
 }
 else
 {
    strcpy(pField,p1);
    memset(pField+iLen,' ',iDiff);
    pField[iFldWthTChrs-1]=0;
 }
 strcpy(p,pField);
 free(p1);
 free(pField);
 
 return iFldWthTChrs-1;
}


size_t __cdecl FltToWch(wchar_t* p, double x, size_t iFldWthTChrs, int iDecPlaces, wchar_t cDecimalSeperator, bool blnRightJustified)
{
 bool blnPositive,blnAbsValLessThanOne;
 bool blnRoundUpSuccessful=false;
 bool blnNeedToRoundUp=false;
 int iRoundingDigitLocation;
 size_t n,i=0,k=0;
 wchar_t* p1;
 
 p1=(wchar_t*)malloc(64);
 if(!p1)
    return 0;
 memset(p1,0,64);
 p1[0]=32, p1[1]=32;
 p1=p1+2;
 if(x>=0.0L)
    blnPositive=true;
 else
 {
    blnPositive=false;
    x=x*-1.0L;
 }
 if(x<1.00000000000000L)
    blnAbsValLessThanOne=true;
 else
    blnAbsValLessThanOne=false;
 #ifdef _M_IX86 
    n=DoubleToU32(x);
 #else   
    n=(size_t)x;
 #endif 
 while(n>0)
 {
   x=x/10;
   #ifdef _M_IX86
      n=DoubleToU32(x);
   #else
      n=(size_t)x;
   #endif 
   i++;
 }
 *(p1+i) = cDecimalSeperator;
 x=x*10;
 #ifdef _M_IX86
    n=DoubleToU32(x);
 #else
    n = (size_t)x;
 #endif 
 x = x-n;
 while(k<=17)
 {
   if(k == i)
      k++;
   *(p1+k)=48+(wchar_t)n;
   x=x*10;
   #ifdef _M_IX86
      n=DoubleToU32(x);
   #else
      n = (size_t)x;
   #endif 
   x = x-n;
   k++;
 }
 *(p1+k) = L'\0';
 iRoundingDigitLocation=(int)i+iDecPlaces+1;
 if(p1[iRoundingDigitLocation]>53)
    blnNeedToRoundUp=true;
 else
 {
    if(p1[iRoundingDigitLocation]==53  && p1[iRoundingDigitLocation-1] % 2)
       blnNeedToRoundUp=true;
 }
 p1[iRoundingDigitLocation]=0;
 if(blnNeedToRoundUp)
 {
    int iStart=iRoundingDigitLocation-1;
    for(int h=iStart; h>=0; h--)
    {
        if(h==i)
           continue;
        else
        {
           if(p1[h]!=57 && p1[h]!=cDecimalSeperator)
           {
              p1[h]=p1[h]+1;
              blnRoundUpSuccessful=true;
              break;
           }
           else
              p1[h]=48;
        }
    }
 }
 if(blnPositive)
 {
    if(blnRoundUpSuccessful==false && blnNeedToRoundUp==true)
    {
       p1[-1]=49;
       if(blnAbsValLessThanOne)
          blnAbsValLessThanOne=false;
    }   
    if(iDecPlaces==0)
       p1[i]=L'\0';     
    if(blnAbsValLessThanOne)
       p1[-1]=48;
 }
 else
 {
    if(blnRoundUpSuccessful==false && blnNeedToRoundUp==true)
    {
       p1[-1]=49;
       p1[-2]=L'-';
       if(blnAbsValLessThanOne)
          blnAbsValLessThanOne=false;
    }   
    if(iDecPlaces==0)
       p1[i]=0;     
    if(blnAbsValLessThanOne)
    {
       p1[-1]=48;
       p1[-2]=L'-';
    }
    else
    {
       if(p1[-1]==32)     
          p1[-1]=L'-';
    }       
 }
 
 int iSpaces=0;
 p1=p1-2;
 for(int i=0; i<18; i++)
 {
     if(p1[i]==32)
        iSpaces++;
 }
 for(int i=0; i<18; i++)
     p1[i]=p1[i+iSpaces];
 size_t iLen=wcslen(p1);
 if(iLen>(iFldWthTChrs-1))
 {
    wmemset(p,L'F',iFldWthTChrs-1);
    p[iFldWthTChrs-1]=0;
    free(p1);
    return 0;
 }   
 wchar_t* pField=(wchar_t*)malloc(iFldWthTChrs*2);
 if(!pField)
 {
    free(p1);
    return 0;
 }
 size_t iDiff = iFldWthTChrs - iLen -1;
 if(blnRightJustified)
 {
    for(size_t i=0; i<iDiff; i++)
        pField[i]=32;
    pField[iDiff]=0; 
    wcscat(pField,p1);
 }
 else
 {
    wcscpy(pField,p1);
    wmemset(pField+iLen,L' ',iDiff);
    pField[iFldWthTChrs-1]=0;
 }
 wcscpy(p,pField);
 free(p1);
 free(pField);
 
 return iFldWthTChrs-1;
}
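
The rounding step in FltToCh and FltToWch works on the digit characters rather than on the double itself: it looks at the single digit just past the last one to keep, rounds up if that digit is greater than 5, and on exactly 5 rounds up only when the last kept digit is odd.  A carry then ripples leftward through any 9s.  The following is a simplified sketch of just that idea, assuming a string that contains digits only (no sign, no decimal separator).  The name RoundDigits is hypothetical, and std::string is used purely to keep the sketch short; it would not be available in a TCLib build.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of TCLib: keep the first `keep` digits of
// `digits`, rounding on the single digit that follows them.
std::string RoundDigits(std::string digits, size_t keep)
{
    for (char c : digits)
        assert(c >= '0' && c <= '9');       // digits only in this sketch

    if (keep >= digits.size())
        return digits;                      // nothing to cut off

    char next = digits[keep];               // the rounding digit
    bool up = next > '5' ||
              (next == '5' && keep > 0 && (digits[keep - 1] - '0') % 2 == 1);

    digits.resize(keep);
    if (!up)
        return digits;

    // Propagate the carry from right to left; a 9 becomes 0 and carries on.
    for (size_t h = keep; h-- > 0; )
    {
        if (digits[h] != '9')
        {
            ++digits[h];
            return digits;
        }
        digits[h] = '0';
    }

    // Every kept digit was a 9, so a new leading 1 is needed.  In FltToCh
    // this is the case handled by the p1[-1]=49 assignment.
    return "1" + digits;
}
```

For example, RoundDigits("1235", 3) gives "124" (the 3 is odd, so the 5 rounds up), RoundDigits("1245", 3) gives "124" (the 4 is even, so the 5 rounds down), and RoundDigits("9996", 3) gives "1000".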

Code: [Select]
//==============================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                             By Fred Harris, March 2016
//
//        cl atol.cpp /D "_CRT_SECURE_NO_WARNINGS" /c /W3 /DWIN32_LEAN_AND_MEAN
//==============================================================================================
#include "stdlib.h"


long __cdecl atol(const char* pStr)
{
 char c,cNeg=0;              // c holds the char; cNeg holds the '-' sign.
 long lTotal=0;              // The running total.

 while(*pStr==32 || *pStr==9) // Skip leading spaces and tabs.
    pStr++; 
 if(*pStr=='-')
 {
    cNeg='-';
    pStr++;
 }
 else if(*pStr=='+')         // An explicit plus sign is allowed too.
    pStr++;
 while(*pStr>='0' && *pStr<='9') // Stop at the first non-digit character.
 {
    c=*pStr++;
    lTotal=10*lTotal+(c-48); // Add this digit to the total.
 }
 if(cNeg=='-')               // If we have a negative sign, convert the value.
    lTotal=-lTotal;
 
 return lTotal;
}


int __cdecl atoi (const char* pStr)
{
 return ((int)atol(pStr));
}


long __cdecl _wtol(const wchar_t* pStr)
{
 wchar_t c,cNeg=0;           // c holds the char; cNeg holds the '-' sign.
 long lTotal=0;              // The running total.

 while(*pStr==32 || *pStr==9) // Skip leading spaces and tabs.
    pStr++; 
 if(*pStr==L'-')
 {
    cNeg=L'-';
    pStr++;
 }
 else if(*pStr==L'+')        // An explicit plus sign is allowed too.
    pStr++;
 while(*pStr>=L'0' && *pStr<=L'9') // Stop at the first non-digit character.
 {
    c=*pStr++;
    lTotal=10*lTotal+(c-48); // Add this digit to the total.
 }
 if(cNeg==L'-')              // If we have a negative sign, convert the value.
    lTotal=-lTotal;
 
 return lTotal;
}


Code: [Select]
//==============================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                             By Fred Harris, March 2016
//
//       cl _atoi64.cpp /D "_CRT_SECURE_NO_WARNINGS" /c /W3 /DWIN32_LEAN_AND_MEAN
//==============================================================================================
#include "stdlib.h"


_int64 __cdecl _atoi64(const char* pStr)
{
 char c,cNeg=0;              // c holds the char; cNeg holds the '-' sign.
 _int64 lTotal=0;            // The running total.

 while(*pStr==32 || *pStr==9) // Skip leading spaces and tabs.
    pStr++; 
 if(*pStr=='-')
 {
    cNeg='-';
    pStr++;
 }
 else if(*pStr=='+')         // An explicit plus sign is allowed too.
    pStr++;
 while(*pStr>='0' && *pStr<='9') // Stop at the first non-digit character.
 {
    c=*pStr++;
    lTotal=10*lTotal+(c-48); // Add this digit to the total.
 }
 if(cNeg=='-')               // If we have a negative sign, convert the value.
    lTotal=-lTotal;
 
 return lTotal;
}


_int64 __cdecl _wtoi64(const wchar_t* pStr)
{
 wchar_t c,cNeg=0;           // c holds the char; cNeg holds the '-' sign.
 _int64 lTotal=0;            // The running total.

 while(*pStr==32 || *pStr==9) // Skip leading spaces and tabs.
    pStr++; 
 if(*pStr==L'-')
 {
    cNeg=L'-';
    pStr++;
 }
 else if(*pStr==L'+')        // An explicit plus sign is allowed too.
    pStr++;
 while(*pStr>=L'0' && *pStr<=L'9') // Stop at the first non-digit character.
 {
    c=*pStr++;
    lTotal=10*lTotal+(c-48); // Add this digit to the total.
 }
 if(cNeg==L'-')              // If we have a negative sign, convert the value.
    lTotal=-lTotal;
 
 return lTotal;
}


Code: [Select]
//==============================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                             By Fred Harris, March 2016
//
//         cl abs.cpp /D "_CRT_SECURE_NO_WARNINGS" /c /W3 /DWIN32_LEAN_AND_MEAN
//==============================================================================================
#include "stdlib.h"

int __cdecl abs(int n)
{
 if(n<0)
    return -n;
 else
    return n;
}
 
_int64 __cdecl _abs64(__int64 n)
{
 if(n<0)
    return -n;
 else
    return n;
}

long __cdecl labs(long n)
{
 if(n<0)
    return -n;
 else
    return n;
}


Code: [Select]
//=====================================================================================
//               Developed As An Addition To Matt Pietrek's LibCTiny.lib
//                            By Fred Harris, January 2016
//
//                     cl memcpy.cpp /c /W3 /DWIN32_LEAN_AND_MEAN
//=====================================================================================
//#include "memory.h"   // Don't Forward Declare memcpy.  Gives Problems!

extern "C" void* __cdecl memcpy(void* pDest, const void* pSrc, size_t iCount)
{
 char* pDestination=(char*)pDest;
 const char* pSource=(const char*)pSrc;

 for(size_t i=0; i<iCount; i++)
     pDestination[i]=pSource[i];

 return pDest;
}

I did not develop the code in win32_crt_math.cpp below.  It was developed by Martins Mozeiko and posted at:

https://hero.handmadedev.org/forum/code-discussion/79-guide-how-to-avoid-c-c-runtime-on-windows

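It is only used for 32 bit builds; see the _M_IX86 #ifdef below.  Martins basically took the asm source from the …\crt\src subdirectories of VC and modified it so it would function within C++ naked functions, which are functions (if you can call them that) without compiler generated prolog or epilog stack setup code.  These functions are not meant to be called directly from application code.  The compiler itself emits calls to them for 64 bit integer arithmetic, which a 32 bit target cannot do in a single instruction.  The _alldiv, _alldvrm, _allmul and _allrem helpers shown here cover signed division, combined division and remainder, multiplication, and remainder.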
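
To make it a bit more concrete why such helpers are needed: 32 bit x86 can multiply two 32 bit registers into a 64 bit result, but it has no single instruction for a 64 bit by 64 bit multiply, so the product has to be assembled from 32 bit halves.  The sketch below shows that decomposition in portable C++.  It is only an illustration of the arithmetic that _allmul carries out; the names MulVia32BitHalves and CheckMul are made up here and are not part of TCLib or of Martins' file.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration: multiply two 64 bit values using only 32 bit
// halves, keeping the low 64 bits of the product as ordinary 64 bit
// arithmetic does.
uint64_t MulVia32BitHalves(uint64_t a, uint64_t b)
{
    uint32_t aLo = (uint32_t)a, aHi = (uint32_t)(a >> 32);
    uint32_t bLo = (uint32_t)b, bHi = (uint32_t)(b >> 32);

    // The cross terms only affect the upper 32 bits of the result.  Anything
    // they would carry past bit 63 is discarded, and aHi * bHi lies entirely
    // above bit 63, so it is not needed at all.
    uint32_t cross = aHi * bLo + aLo * bHi;

    uint64_t low = (uint64_t)aLo * bLo;      // full 32 x 32 -> 64 bit product
    return low + ((uint64_t)cross << 32);
}

// Compares the sketch against the compiler's own 64 bit multiply.
void CheckMul(uint64_t a, uint64_t b)
{
    assert(MulVia32BitHalves(a, b) == a * b);
}
```

Because only the low 64 bits are kept, the same routine serves signed and unsigned operands alike.  When building for x86 with the Microsoft compiler, writing a * b, a / b or a % b on __int64 values is typically what causes calls to _allmul, _alldiv and _allrem to be generated; in an x64 build the operations are done inline and none of these helpers are needed.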

continued...

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #5 on: March 23, 2016, 01:29:04 AM »
// win32_crt_math.cpp
Code: [Select]
// C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link /subsystem:windows /nodefaultlib kernel32.lib
// /entry:wWinMainCRTStartup /subsystem:windows
// cl win32_crt_math.cpp  /Gm- /GR- /EHa /Oi /GS- /c /W3 /DWIN32_LEAN_AND_MEAN /link /subsystem:windows /nodefaultlib kernel32.lib //this worked
// cl win32_crt_math.cpp  /Gm- /GR- /EHa /Oi /GS- /c /W3 /DWIN32_LEAN_AND_MEAN        //this worked too!
// win32_crt_math.cpp

#ifdef _M_IX86 // use this file only for 32-bit architecture

#define CRT_LOWORD(x) dword ptr [x+0]
#define CRT_HIWORD(x) dword ptr [x+4]

extern "C"
{
    __declspec(naked) void _alldiv()
    {
        #define DVND    esp + 16      // stack address of dividend (a)
        #define DVSR    esp + 24      // stack address of divisor (b)

        __asm
        {
        push    edi
        push    esi
        push    ebx

; Determine sign of the result (edi = 0 if result is positive, non-zero
; otherwise) and make operands positive.

        xor     edi,edi         ; result sign assumed positive

        mov     eax,CRT_HIWORD(DVND) ; hi word of a
        or      eax,eax         ; test to see if signed
        jge     short L1        ; skip rest if a is already positive
        inc     edi             ; complement result sign flag
        mov     edx,CRT_LOWORD(DVND) ; lo word of a
        neg     eax             ; make a positive
        neg     edx
        sbb     eax,0
        mov     CRT_HIWORD(DVND),eax ; save positive value
        mov     CRT_LOWORD(DVND),edx
L1:
        mov     eax,CRT_HIWORD(DVSR) ; hi word of b
        or      eax,eax         ; test to see if signed
        jge     short L2        ; skip rest if b is already positive
        inc     edi             ; complement the result sign flag
        mov     edx,CRT_LOWORD(DVSR) ; lo word of b
        neg     eax             ; make b positive
        neg     edx
        sbb     eax,0
        mov     CRT_HIWORD(DVSR),eax ; save positive value
        mov     CRT_LOWORD(DVSR),edx
L2:

;
; Now do the divide.  First look to see if the divisor is less than 4194304K.
; If so, then we can use a simple algorithm with word divides, otherwise
; things get a little more complex.
;
; NOTE - eax currently contains the high order word of DVSR
;

        or      eax,eax         ; check to see if divisor < 4194304K
        jnz     short L3        ; nope, gotta do this the hard way
        mov     ecx,CRT_LOWORD(DVSR) ; load divisor
        mov     eax,CRT_HIWORD(DVND) ; load high word of dividend
        xor     edx,edx
        div     ecx             ; eax <- high order bits of quotient
        mov     ebx,eax         ; save high bits of quotient
        mov     eax,CRT_LOWORD(DVND) ; edx:eax <- remainder:lo word of dividend
        div     ecx             ; eax <- low order bits of quotient
        mov     edx,ebx         ; edx:eax <- quotient
        jmp     short L4        ; set sign, restore stack and return

;
; Here we do it the hard way.  Remember, eax contains the high word of DVSR
;

L3:
        mov     ebx,eax         ; ebx:ecx <- divisor
        mov     ecx,CRT_LOWORD(DVSR)
        mov     edx,CRT_HIWORD(DVND) ; edx:eax <- dividend
        mov     eax,CRT_LOWORD(DVND)
L5:
        shr     ebx,1           ; shift divisor right one bit
        rcr     ecx,1
        shr     edx,1           ; shift dividend right one bit
        rcr     eax,1
        or      ebx,ebx
        jnz     short L5        ; loop until divisor < 4194304K
        div     ecx             ; now divide, ignore remainder
        mov     esi,eax         ; save quotient

;
; We may be off by one, so to check, we will multiply the quotient
; by the divisor and check the result against the original dividend.
; Note that we must also check for overflow, which can occur if the
; dividend is close to 2**64 and the quotient is off by 1.
;

        mul     CRT_HIWORD(DVSR) ; QUOT * CRT_HIWORD(DVSR)
        mov     ecx,eax
        mov     eax,CRT_LOWORD(DVSR)
        mul     esi             ; QUOT * CRT_LOWORD(DVSR)
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jc      short L6        ; carry means Quotient is off by 1

;
; do long compare here between original dividend and the result of the
; multiply in edx:eax.  If original is larger or equal, we are ok, otherwise
; subtract one (1) from the quotient.
;

        cmp     edx,CRT_HIWORD(DVND) ; compare hi words of result and original
        ja      short L6        ; if result > original, do subtract
        jb      short L7        ; if result < original, we are ok
        cmp     eax,CRT_LOWORD(DVND) ; hi words are equal, compare lo words
        jbe     short L7        ; if less or equal we are ok, else subtract
L6:
        dec     esi             ; subtract 1 from quotient
L7:
        xor     edx,edx         ; edx:eax <- quotient
        mov     eax,esi

;
; Just the cleanup left to do.  edx:eax contains the quotient.  Set the sign
; according to the save value, cleanup the stack, and return.
;

L4:
        dec     edi             ; check to see if result is negative
        jnz     short L8        ; if EDI == 0, result should be negative
        neg     edx             ; otherwise, negate the result
        neg     eax
        sbb     edx,0

;
; Restore the saved registers and return.
;

L8:
        pop     ebx
        pop     esi
        pop     edi

        ret     16
        }

        #undef DVND
        #undef DVSR
    }

    __declspec(naked) void _alldvrm()
    {
        #define DVND    esp + 16      // stack address of dividend (a)
        #define DVSR    esp + 24      // stack address of divisor (b)

        __asm
        {
        push    edi
        push    esi
        push    ebp

; Determine sign of the quotient (edi = 0 if result is positive, non-zero
; otherwise) and make operands positive.
; Sign of the remainder is kept in ebp.

        xor     edi,edi         ; result sign assumed positive
        xor     ebp,ebp         ; result sign assumed positive

        mov     eax,CRT_HIWORD(DVND) ; hi word of a
        or      eax,eax         ; test to see if signed
        jge     short L1        ; skip rest if a is already positive
        inc     edi             ; complement result sign flag
        inc     ebp             ; complement result sign flag
        mov     edx,CRT_LOWORD(DVND) ; lo word of a
        neg     eax             ; make a positive
        neg     edx
        sbb     eax,0
        mov     CRT_HIWORD(DVND),eax ; save positive value
        mov     CRT_LOWORD(DVND),edx
L1:
        mov     eax,CRT_HIWORD(DVSR) ; hi word of b
        or      eax,eax         ; test to see if signed
        jge     short L2        ; skip rest if b is already positive
        inc     edi             ; complement the result sign flag
        mov     edx,CRT_LOWORD(DVSR) ; lo word of b
        neg     eax             ; make b positive
        neg     edx
        sbb     eax,0
        mov     CRT_HIWORD(DVSR),eax ; save positive value
        mov     CRT_LOWORD(DVSR),edx
L2:

;
; Now do the divide.  First look to see if the divisor is less than 4194304K.
; If so, then we can use a simple algorithm with word divides, otherwise
; things get a little more complex.
;
; NOTE - eax currently contains the high order word of DVSR
;

        or      eax,eax         ; check to see if divisor < 4194304K
        jnz     short L3        ; nope, gotta do this the hard way
        mov     ecx,CRT_LOWORD(DVSR) ; load divisor
        mov     eax,CRT_HIWORD(DVND) ; load high word of dividend
        xor     edx,edx
        div     ecx             ; eax <- high order bits of quotient
        mov     ebx,eax         ; save high bits of quotient
        mov     eax,CRT_LOWORD(DVND) ; edx:eax <- remainder:lo word of dividend
        div     ecx             ; eax <- low order bits of quotient
        mov     esi,eax         ; ebx:esi <- quotient
;
; Now we need to do a multiply so that we can compute the remainder.
;
        mov     eax,ebx         ; set up high word of quotient
        mul     CRT_LOWORD(DVSR) ; CRT_HIWORD(QUOT) * DVSR
        mov     ecx,eax         ; save the result in ecx
        mov     eax,esi         ; set up low word of quotient
        mul     CRT_LOWORD(DVSR) ; CRT_LOWORD(QUOT) * DVSR
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jmp     short L4        ; complete remainder calculation

;
; Here we do it the hard way.  Remember, eax contains the high word of DVSR
;

L3:
        mov     ebx,eax         ; ebx:ecx <- divisor
        mov     ecx,CRT_LOWORD(DVSR)
        mov     edx,CRT_HIWORD(DVND) ; edx:eax <- dividend
        mov     eax,CRT_LOWORD(DVND)
L5:
        shr     ebx,1           ; shift divisor right one bit
        rcr     ecx,1
        shr     edx,1           ; shift dividend right one bit
        rcr     eax,1
        or      ebx,ebx
        jnz     short L5        ; loop until divisor < 4194304K
        div     ecx             ; now divide, ignore remainder
        mov     esi,eax         ; save quotient

;
; We may be off by one, so to check, we will multiply the quotient
; by the divisor and check the result against the original dividend.
; Note that we must also check for overflow, which can occur if the
; dividend is close to 2**64 and the quotient is off by 1.
;

        mul     CRT_HIWORD(DVSR) ; QUOT * CRT_HIWORD(DVSR)
        mov     ecx,eax
        mov     eax,CRT_LOWORD(DVSR)
        mul     esi             ; QUOT * CRT_LOWORD(DVSR)
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jc      short L6        ; carry means Quotient is off by 1

;
; do long compare here between original dividend and the result of the
; multiply in edx:eax.  If original is larger or equal, we are ok, otherwise
; subtract one (1) from the quotient.
;

        cmp     edx,CRT_HIWORD(DVND) ; compare hi words of result and original
        ja      short L6        ; if result > original, do subtract
        jb      short L7        ; if result < original, we are ok
        cmp     eax,CRT_LOWORD(DVND) ; hi words are equal, compare lo words
        jbe     short L7        ; if less or equal we are ok, else subtract
L6:
        dec     esi             ; subtract 1 from quotient
        sub     eax,CRT_LOWORD(DVSR) ; subtract divisor from result
        sbb     edx,CRT_HIWORD(DVSR)
L7:
        xor     ebx,ebx         ; ebx:esi <- quotient

L4:
;
; Calculate remainder by subtracting the result from the original dividend.
; Since the result is already in a register, we will do the subtract in the
; opposite direction and negate the result if necessary.
;

        sub     eax,CRT_LOWORD(DVND) ; subtract dividend from result
        sbb     edx,CRT_HIWORD(DVND)

;
; Now check the result sign flag to see if the result is supposed to be positive
; or negative.  It is currently negated (because we subtracted in the 'wrong'
; direction), so if the sign flag is set we are done, otherwise we must negate
; the result to make it positive again.
;

        dec     ebp             ; check result sign flag
        jns     short L9        ; result is ok, set up the quotient
        neg     edx             ; otherwise, negate the result
        neg     eax
        sbb     edx,0

;
; Now we need to get the quotient into edx:eax and the remainder into ebx:ecx.
;
L9:
        mov     ecx,edx
        mov     edx,ebx
        mov     ebx,ecx
        mov     ecx,eax
        mov     eax,esi

;
; Just the cleanup left to do.  edx:eax contains the quotient.  Set the sign
; according to the save value, cleanup the stack, and return.
;

        dec     edi             ; check to see if result is negative
        jnz     short L8        ; if EDI == 0, result should be negative
        neg     edx             ; otherwise, negate the result
        neg     eax
        sbb     edx,0

;
; Restore the saved registers and return.
;

L8:
        pop     ebp
        pop     esi
        pop     edi

        ret     16
        }

        #undef DVND
        #undef DVSR
    }

    __declspec(naked) void _allmul()
    {
        #define A       esp + 8       // stack address of a
        #define B       esp + 16      // stack address of b

        __asm
        {
        push    ebx

        mov     eax,CRT_HIWORD(A)
        mov     ecx,CRT_LOWORD(B)
        mul     ecx             ;eax has AHI, ecx has BLO, so AHI * BLO
        mov     ebx,eax         ;save result

        mov     eax,CRT_LOWORD(A)
        mul     CRT_HIWORD(B)       ;ALO * BHI
        add     ebx,eax         ;ebx = ((ALO * BHI) + (AHI * BLO))

        mov     eax,CRT_LOWORD(A)   ;ecx = BLO
        mul     ecx             ;so edx:eax = ALO*BLO
        add     edx,ebx         ;now edx has all the LO*HI stuff

        pop     ebx

        ret     16              ; callee restores the stack
        }

        #undef A
        #undef B
    }

    __declspec(naked) void _allrem()
    {
        #define DVND    esp + 12      // stack address of dividend (a)
        #define DVSR    esp + 20      // stack address of divisor (b)

        __asm
        {
        push    ebx
        push    edi


; Determine sign of the result (edi = 0 if result is positive, non-zero
; otherwise) and make operands positive.

        xor     edi,edi         ; result sign assumed positive

        mov     eax,CRT_HIWORD(DVND) ; hi word of a
        or      eax,eax         ; test to see if signed
        jge     short L1        ; skip rest if a is already positive
        inc     edi             ; complement result sign flag bit
        mov     edx,CRT_LOWORD(DVND) ; lo word of a
        neg     eax             ; make a positive
        neg     edx
        sbb     eax,0
        mov     CRT_HIWORD(DVND),eax ; save positive value
        mov     CRT_LOWORD(DVND),edx
L1:
        mov     eax,CRT_HIWORD(DVSR) ; hi word of b
        or      eax,eax         ; test to see if signed
        jge     short L2        ; skip rest if b is already positive
        mov     edx,CRT_LOWORD(DVSR) ; lo word of b
        neg     eax             ; make b positive
        neg     edx
        sbb     eax,0
        mov     CRT_HIWORD(DVSR),eax ; save positive value
        mov     CRT_LOWORD(DVSR),edx
L2:

;
; Now do the divide.  First look to see if the divisor is less than 4194304K.
; If so, then we can use a simple algorithm with word divides, otherwise
; things get a little more complex.
;
; NOTE - eax currently contains the high order word of DVSR
;

        or      eax,eax         ; check to see if divisor < 4194304K
        jnz     short L3        ; nope, gotta do this the hard way
        mov     ecx,CRT_LOWORD(DVSR) ; load divisor
        mov     eax,CRT_HIWORD(DVND) ; load high word of dividend
        xor     edx,edx
        div     ecx             ; edx <- remainder
        mov     eax,CRT_LOWORD(DVND) ; edx:eax <- remainder:lo word of dividend
        div     ecx             ; edx <- final remainder
        mov     eax,edx         ; edx:eax <- remainder
        xor     edx,edx
        dec     edi             ; check result sign flag
        jns     short L4        ; negate result, restore stack and return
        jmp     short L8        ; result sign ok, restore stack and return

;
; Here we do it the hard way.  Remember, eax contains the high word of DVSR
;

L3:
        mov     ebx,eax         ; ebx:ecx <- divisor
        mov     ecx,CRT_LOWORD(DVSR)
        mov     edx,CRT_HIWORD(DVND) ; edx:eax <- dividend
        mov     eax,CRT_LOWORD(DVND)
L5:
        shr     ebx,1           ; shift divisor right one bit
        rcr     ecx,1
        shr     edx,1           ; shift dividend right one bit
        rcr     eax,1
        or      ebx,ebx
        jnz     short L5        ; loop until divisor < 4194304K
        div     ecx             ; now divide, ignore remainder

;
; We may be off by one, so to check, we will multiply the quotient
; by the divisor and check the result against the orignal dividend
; Note that we must also check for overflow, which can occur if the
; dividend is close to 2**64 and the quotient is off by 1.
;

        mov     ecx,eax         ; save a copy of quotient in ECX
        mul     CRT_HIWORD(DVSR)
        xchg    ecx,eax         ; save product, get quotient in EAX
        mul     CRT_LOWORD(DVSR)
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jc      short L6        ; carry means Quotient is off by 1

;
; do long compare here between original dividend and the result of the
; multiply in edx:eax.  If original is larger or equal, we are ok, otherwise
; subtract the original divisor from the result.
;

        cmp     edx,CRT_HIWORD(DVND) ; compare hi words of result and original
        ja      short L6        ; if result > original, do subtract
        jb      short L7        ; if result < original, we are ok
        cmp     eax,CRT_LOWORD(DVND) ; hi words are equal, compare lo words
        jbe     short L7        ; if less or equal we are ok, else subtract
L6:
        sub     eax,CRT_LOWORD(DVSR) ; subtract divisor from result
        sbb     edx,CRT_HIWORD(DVSR)
L7:

;
; Calculate remainder by subtracting the result from the original dividend.
; Since the result is already in a register, we will do the subtract in the
; opposite direction and negate the result if necessary.
;

        sub     eax,CRT_LOWORD(DVND) ; subtract dividend from result
        sbb     edx,CRT_HIWORD(DVND)

;
; Now check the result sign flag to see if the result is supposed to be positive
; or negative.  It is currently negated (because we subtracted in the 'wrong'
; direction), so if the sign flag is set we are done, otherwise we must negate
; the result to make it positive again.
;

        dec     edi             ; check result sign flag
        jns     short L8        ; result is ok, restore stack and return
L4:
        neg     edx             ; otherwise, negate the result
        neg     eax
        sbb     edx,0

;
; Just the cleanup left to do.  edx:eax contains the remainder.
; Restore the saved registers and return.
;

L8:
        pop     edi
        pop     ebx

        ret     16
        }

        #undef DVND
        #undef DVSR
    }

    __declspec(naked) void _allshl()
    {
        __asm
        {
;
; Handle shifts of 64 or more bits (all get 0)
;
        cmp     cl, 64
        jae     short RETZERO

;
; Handle shifts of between 0 and 31 bits
;
        cmp     cl, 32
        jae     short MORE32
        shld    edx,eax,cl
        shl     eax,cl
        ret

;
; Handle shifts of between 32 and 63 bits
;
MORE32:
        mov     edx,eax
        xor     eax,eax
        and     cl,31
        shl     edx,cl
        ret

;
; return 0 in edx:eax
;
RETZERO:
        xor     eax,eax
        xor     edx,edx
        ret
        }
    }

    __declspec(naked) void _allshr()
    {
        __asm
        {
;
; Handle shifts of 64 bits or more (if shifting 64 bits or more, the result
; depends only on the high order bit of edx).
;
        cmp     cl,64
        jae     short RETSIGN

;
; Handle shifts of between 0 and 31 bits
;
        cmp     cl, 32
        jae     short MORE32
        shrd    eax,edx,cl
        sar     edx,cl
        ret

;
; Handle shifts of between 32 and 63 bits
;
MORE32:
        mov     eax,edx
        sar     edx,31
        and     cl,31
        sar     eax,cl
        ret

;
; Return double precision 0 or -1, depending on the sign of edx
;
RETSIGN:
        sar     edx,31
        mov     eax,edx
        ret
        }
    }

    __declspec(naked) void _aulldiv()
    {
        #define DVND    esp + 12      // stack address of dividend (a)
        #define DVSR    esp + 20      // stack address of divisor (b)

        __asm
        {
        push    ebx
        push    esi

;
; Now do the divide.  First look to see if the divisor is less than 4194304K.
; If so, then we can use a simple algorithm with word divides, otherwise
; things get a little more complex.
;

        mov     eax,CRT_HIWORD(DVSR) ; check to see if divisor < 4194304K
        or      eax,eax
        jnz     short L1        ; nope, gotta do this the hard way
        mov     ecx,CRT_LOWORD(DVSR) ; load divisor
        mov     eax,CRT_HIWORD(DVND) ; load high word of dividend
        xor     edx,edx
        div     ecx             ; get high order bits of quotient
        mov     ebx,eax         ; save high bits of quotient
        mov     eax,CRT_LOWORD(DVND) ; edx:eax <- remainder:lo word of dividend
        div     ecx             ; get low order bits of quotient
        mov     edx,ebx         ; edx:eax <- quotient hi:quotient lo
        jmp     short L2        ; restore stack and return

;
; Here we do it the hard way.  Remember, eax contains DVSRHI
;

L1:
        mov     ecx,eax         ; ecx:ebx <- divisor
        mov     ebx,CRT_LOWORD(DVSR)
        mov     edx,CRT_HIWORD(DVND) ; edx:eax <- dividend
        mov     eax,CRT_LOWORD(DVND)
L3:
        shr     ecx,1           ; shift divisor right one bit; hi bit <- 0
        rcr     ebx,1
        shr     edx,1           ; shift dividend right one bit; hi bit <- 0
        rcr     eax,1
        or      ecx,ecx
        jnz     short L3        ; loop until divisor < 4194304K
        div     ebx             ; now divide, ignore remainder
        mov     esi,eax         ; save quotient

;
; We may be off by one, so to check, we will multiply the quotient
; by the divisor and check the result against the orignal dividend
; Note that we must also check for overflow, which can occur if the
; dividend is close to 2**64 and the quotient is off by 1.
;

        mul     CRT_HIWORD(DVSR) ; QUOT * CRT_HIWORD(DVSR)
        mov     ecx,eax
        mov     eax,CRT_LOWORD(DVSR)
        mul     esi             ; QUOT * CRT_LOWORD(DVSR)
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jc      short L4        ; carry means Quotient is off by 1

;
; do long compare here between original dividend and the result of the
; multiply in edx:eax.  If original is larger or equal, we are ok, otherwise
; subtract one (1) from the quotient.
;

        cmp     edx,CRT_HIWORD(DVND) ; compare hi words of result and original
        ja      short L4        ; if result > original, do subtract
        jb      short L5        ; if result < original, we are ok
        cmp     eax,CRT_LOWORD(DVND) ; hi words are equal, compare lo words
        jbe     short L5        ; if less or equal we are ok, else subtract
L4:
        dec     esi             ; subtract 1 from quotient
L5:
        xor     edx,edx         ; edx:eax <- quotient
        mov     eax,esi

;
; Just the cleanup left to do.  edx:eax contains the quotient.
; Restore the saved registers and return.
;

L2:

        pop     esi
        pop     ebx

        ret     16
        }

        #undef DVND
        #undef DVSR
    }

    __declspec(naked) void _aulldvrm()
    {
        #define DVND    esp + 8       // stack address of dividend (a)
        #define DVSR    esp + 16      // stack address of divisor (b)

        __asm
        {
        push    esi

;
; Now do the divide.  First look to see if the divisor is less than 4194304K.
; If so, then we can use a simple algorithm with word divides, otherwise
; things get a little more complex.
;

        mov     eax,CRT_HIWORD(DVSR) ; check to see if divisor < 4194304K
        or      eax,eax
        jnz     short L1        ; nope, gotta do this the hard way
        mov     ecx,CRT_LOWORD(DVSR) ; load divisor
        mov     eax,CRT_HIWORD(DVND) ; load high word of dividend
        xor     edx,edx
        div     ecx             ; get high order bits of quotient
        mov     ebx,eax         ; save high bits of quotient
        mov     eax,CRT_LOWORD(DVND) ; edx:eax <- remainder:lo word of dividend
        div     ecx             ; get low order bits of quotient
        mov     esi,eax         ; ebx:esi <- quotient

;
; Now we need to do a multiply so that we can compute the remainder.
;
        mov     eax,ebx         ; set up high word of quotient
        mul     CRT_LOWORD(DVSR) ; CRT_HIWORD(QUOT) * DVSR
        mov     ecx,eax         ; save the result in ecx
        mov     eax,esi         ; set up low word of quotient
        mul     CRT_LOWORD(DVSR) ; CRT_LOWORD(QUOT) * DVSR
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jmp     short L2        ; complete remainder calculation

;
; Here we do it the hard way.  Remember, eax contains DVSRHI
;

L1:
        mov     ecx,eax         ; ecx:ebx <- divisor
        mov     ebx,CRT_LOWORD(DVSR)
        mov     edx,CRT_HIWORD(DVND) ; edx:eax <- dividend
        mov     eax,CRT_LOWORD(DVND)
L3:
        shr     ecx,1           ; shift divisor right one bit; hi bit <- 0
        rcr     ebx,1
        shr     edx,1           ; shift dividend right one bit; hi bit <- 0
        rcr     eax,1
        or      ecx,ecx
        jnz     short L3        ; loop until divisor < 4194304K
        div     ebx             ; now divide, ignore remainder
        mov     esi,eax         ; save quotient

;
; We may be off by one, so to check, we will multiply the quotient
; by the divisor and check the result against the orignal dividend
; Note that we must also check for overflow, which can occur if the
; dividend is close to 2**64 and the quotient is off by 1.
;

        mul     CRT_HIWORD(DVSR) ; QUOT * CRT_HIWORD(DVSR)
        mov     ecx,eax
        mov     eax,CRT_LOWORD(DVSR)
        mul     esi             ; QUOT * CRT_LOWORD(DVSR)
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jc      short L4        ; carry means Quotient is off by 1

;
; do long compare here between original dividend and the result of the
; multiply in edx:eax.  If original is larger or equal, we are ok, otherwise
; subtract one (1) from the quotient.
;

        cmp     edx,CRT_HIWORD(DVND) ; compare hi words of result and original
        ja      short L4        ; if result > original, do subtract
        jb      short L5        ; if result < original, we are ok
        cmp     eax,CRT_LOWORD(DVND) ; hi words are equal, compare lo words
        jbe     short L5        ; if less or equal we are ok, else subtract
L4:
        dec     esi             ; subtract 1 from quotient
        sub     eax,CRT_LOWORD(DVSR) ; subtract divisor from result
        sbb     edx,CRT_HIWORD(DVSR)
L5:
        xor     ebx,ebx         ; ebx:esi <- quotient

L2:
;
; Calculate remainder by subtracting the result from the original dividend.
; Since the result is already in a register, we will do the subtract in the
; opposite direction and negate the result.
;

        sub     eax,CRT_LOWORD(DVND) ; subtract dividend from result
        sbb     edx,CRT_HIWORD(DVND)
        neg     edx             ; otherwise, negate the result
        neg     eax
        sbb     edx,0

;
; Now we need to get the quotient into edx:eax and the remainder into ebx:ecx.
;
        mov     ecx,edx
        mov     edx,ebx
        mov     ebx,ecx
        mov     ecx,eax
        mov     eax,esi
;
; Just the cleanup left to do.  edx:eax contains the quotient.
; Restore the saved registers and return.
;

        pop     esi

        ret     16
        }

        #undef DVND
        #undef DVSR
    }

    __declspec(naked) void _aullrem()
    {
        #define DVND    esp + 8       // stack address of dividend (a)
        #define DVSR    esp + 16      // stack address of divisor (b)

        __asm
        {
        push    ebx

; Now do the divide.  First look to see if the divisor is less than 4194304K.
; If so, then we can use a simple algorithm with word divides, otherwise
; things get a little more complex.
;

        mov     eax,CRT_HIWORD(DVSR) ; check to see if divisor < 4194304K
        or      eax,eax
        jnz     short L1        ; nope, gotta do this the hard way
        mov     ecx,CRT_LOWORD(DVSR) ; load divisor
        mov     eax,CRT_HIWORD(DVND) ; load high word of dividend
        xor     edx,edx
        div     ecx             ; edx <- remainder, eax <- quotient
        mov     eax,CRT_LOWORD(DVND) ; edx:eax <- remainder:lo word of dividend
        div     ecx             ; edx <- final remainder
        mov     eax,edx         ; edx:eax <- remainder
        xor     edx,edx
        jmp     short L2        ; restore stack and return

;
; Here we do it the hard way.  Remember, eax contains DVSRHI
;

L1:
        mov     ecx,eax         ; ecx:ebx <- divisor
        mov     ebx,CRT_LOWORD(DVSR)
        mov     edx,CRT_HIWORD(DVND) ; edx:eax <- dividend
        mov     eax,CRT_LOWORD(DVND)
L3:
        shr     ecx,1           ; shift divisor right one bit; hi bit <- 0
        rcr     ebx,1
        shr     edx,1           ; shift dividend right one bit; hi bit <- 0
        rcr     eax,1
        or      ecx,ecx
        jnz     short L3        ; loop until divisor < 4194304K
        div     ebx             ; now divide, ignore remainder

;
; We may be off by one, so to check, we will multiply the quotient
; by the divisor and check the result against the orignal dividend
; Note that we must also check for overflow, which can occur if the
; dividend is close to 2**64 and the quotient is off by 1.
;

        mov     ecx,eax         ; save a copy of quotient in ECX
        mul     CRT_HIWORD(DVSR)
        xchg    ecx,eax         ; put partial product in ECX, get quotient in EAX
        mul     CRT_LOWORD(DVSR)
        add     edx,ecx         ; EDX:EAX = QUOT * DVSR
        jc      short L4        ; carry means Quotient is off by 1

;
; do long compare here between original dividend and the result of the
; multiply in edx:eax.  If original is larger or equal, we're ok, otherwise
; subtract the original divisor from the result.
;

        cmp     edx,CRT_HIWORD(DVND) ; compare hi words of result and original
        ja      short L4        ; if result > original, do subtract
        jb      short L5        ; if result < original, we're ok
        cmp     eax,CRT_LOWORD(DVND) ; hi words are equal, compare lo words
        jbe     short L5        ; if less or equal we're ok, else subtract
L4:
        sub     eax,CRT_LOWORD(DVSR) ; subtract divisor from result
        sbb     edx,CRT_HIWORD(DVSR)
L5:

;
; Calculate remainder by subtracting the result from the original dividend.
; Since the result is already in a register, we will perform the subtract in
; the opposite direction and negate the result to make it positive.
;

        sub     eax,CRT_LOWORD(DVND) ; subtract original dividend from result
        sbb     edx,CRT_HIWORD(DVND)
        neg     edx             ; and negate it
        neg     eax
        sbb     edx,0

;
; Just the cleanup left to do.  edx:eax contains the remainder.
; Restore the saved registers and return.
;

L2:

        pop     ebx

        ret     16
        }

        #undef DVND
        #undef DVSR
    }

    __declspec(naked) void _aullshr()
    {
        __asm
        {
        cmp     cl,64
        jae     short RETZERO

;
; Handle shifts of between 0 and 31 bits
;
        cmp     cl, 32
        jae     short MORE32
        shrd    eax,edx,cl
        shr     edx,cl
        ret

;
; Handle shifts of between 32 and 63 bits
;
MORE32:
        mov     eax,edx
        xor     edx,edx
        and     cl,31
        shr     eax,cl
        ret

;
; return 0 in edx:eax
;
RETZERO:
        xor     eax,eax
        xor     edx,edx
        ret
        }
    }
}

#undef CRT_LOWORD
#undef CRT_HIWORD

#endif
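
In case the _aulldiv listing above is hard to follow in assembler, here is a rough portable C++ sketch of the same algorithm.  It is not the code that goes into TCLib.  The name UDiv64Sketch is made up for illustration, and the 64 bit by 32 bit divides in it merely stand in for what the x86 div instruction does in hardware (so in a real 32 bit build this sketch would itself need _aulldiv).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name; models the algorithm of the _aulldiv listing, not its exact code.
std::uint64_t UDiv64Sketch(std::uint64_t dvnd, std::uint64_t dvsr)
{
    assert(dvsr != 0);   // assumption: caller never passes a zero divisor

    const std::uint32_t dvsrHi = static_cast<std::uint32_t>(dvsr >> 32);
    const std::uint32_t dvsrLo = static_cast<std::uint32_t>(dvsr);

    if (dvsrHi == 0)
    {
        // Easy way: the divisor fits in 32 bits, so two chained divides do the job.
        const std::uint32_t dvndHi = static_cast<std::uint32_t>(dvnd >> 32);
        const std::uint32_t dvndLo = static_cast<std::uint32_t>(dvnd);
        const std::uint32_t quotHi = dvndHi / dvsrLo;
        const std::uint64_t rest   = (static_cast<std::uint64_t>(dvndHi % dvsrLo) << 32) | dvndLo;
        const std::uint32_t quotLo = static_cast<std::uint32_t>(rest / dvsrLo);  // models 'div ecx'
        return (static_cast<std::uint64_t>(quotHi) << 32) | quotLo;
    }

    // Hard way: shift both operands right until the divisor fits in 32 bits.
    std::uint64_t n = dvnd, d = dvsr;
    do { d >>= 1; n >>= 1; } while ((d >> 32) != 0);

    // Estimated quotient; it is either exact or one too large.
    std::uint32_t quot = static_cast<std::uint32_t>(n / static_cast<std::uint32_t>(d));

    // Multiply the estimate back (quot * dvsr) in 32 bit pieces, watching for a carry.
    const std::uint32_t cross  = static_cast<std::uint32_t>(static_cast<std::uint64_t>(quot) * dvsrHi);
    const std::uint64_t low    = static_cast<std::uint64_t>(quot) * dvsrLo;
    const std::uint32_t prodHi = static_cast<std::uint32_t>(low >> 32) + cross;
    const bool          carry  = prodHi < cross;
    const std::uint64_t prod   = (static_cast<std::uint64_t>(prodHi) << 32) | static_cast<std::uint32_t>(low);

    if (carry || prod > dvnd)
        --quot;          // the estimate was one too large
    return quot;
}
```

Calling UDiv64Sketch(a, b) should give the same answer as a / b for any non-zero b.  The two branches correspond to the 'simple algorithm' and 'hard way' comments in the listing.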

Well, that was a lot.  But we’re not done yet.  I now need to provide the *.h header files referenced in the code above.  If you use the system includes in place of these, you will almost certainly run into difficulties.

Code: [Select]
// malloc.h
#ifndef malloc_h
#define malloc_h

extern "C" void*  __cdecl malloc     (size_t size                  );
extern "C" void   __cdecl free       (void*  pMem                  );
extern "C" void*  __cdecl realloc    (void*  pMem,   size_t size   );
extern "C" void*  __cdecl calloc     (size_t nitems, size_t size   );
extern "C" void*  __cdecl _nh_malloc (size_t size,   int    nhFlag );
extern "C" size_t __cdecl _msize     (void*  pMem                  );

#endif


Code: [Select]
// memory.h
#ifndef memory_h
#define memory_h

extern "C" void*     __cdecl memset  (void*    p,     int     c,    size_t count );
extern "C" wchar_t*  __cdecl wmemset (wchar_t* p,     wchar_t c,    size_t count );
//extern "C" void*     __cdecl memcpy  (void*    pDest, void*   pSrc, size_t iCount);  // Gives Problems!

#endif


Code: [Select]
// stdio.h
#ifndef stdio_h
#define stdio_h

extern "C" char __cdecl getchar();

extern "C" int  __cdecl printf(const char* format, ...);
extern "C" int  __cdecl wprintf(const wchar_t* format, ...);
extern "C" int  __cdecl sprintf(char* buffer, const char* format, ...);
extern "C" int  __cdecl swprintf(wchar_t* buffer, const wchar_t* format, ...);

extern "C" size_t __cdecl FltToCh(char* p, double x, size_t iFldWthTChrs, int iDecPlaces, char cDecimalSeperator, bool blnRightJustified);
extern "C" size_t __cdecl FltToWch(wchar_t* p, double x, size_t iFldWthTChrs, int iDecPlaces, wchar_t cDecimalSeperator, bool blnRightJustified);

#endif
Code: [Select]
// stdlib.h
#ifndef stdlib_h
#define stdlib_h
   #define NULL 0
   extern "C" void*   __cdecl malloc  (size_t          size);
   extern "C" void    __cdecl free    (void*           pMem);
   extern "C" long    __cdecl atol    (const char*     pStr);
   extern "C" int     __cdecl atoi    (const char*     pStr);
   extern "C" long    __cdecl _wtol   (const wchar_t*  pStr);
   extern "C" __int64 __cdecl _atoi64 (const char*     pStr);
   extern "C" __int64 __cdecl _wtoi64 (const wchar_t*  pStr);
   extern "C" int     __cdecl abs     (int             n   );
   extern "C" long    __cdecl labs    (long            n   );
   extern "C" __int64 __cdecl _abs64  (__int64         n   );
#endif

Code: [Select]
// string.h
#ifndef string_h
#define string_h

extern "C" size_t   __cdecl strlen(const char* pStr);
extern "C" size_t   __cdecl wcslen(const wchar_t* pStr);

extern "C" char*    __cdecl strcpy(char* strDestination, const char* strSource);
extern "C" wchar_t* __cdecl wcscpy(wchar_t* strDestination, const wchar_t* strSource);

extern "C" char*    __cdecl strcat(char* strDest, const char* strSource);
extern "C" wchar_t* __cdecl wcscat(wchar_t* strDest, const wchar_t* strSource);

extern "C" int      __cdecl strcmp(const char* string1, const char* string2);
extern "C" int      __cdecl wcscmp(const wchar_t* string1, const wchar_t* string2);

extern "C" int      __cdecl strncmp(const char* str1, const char* str2, size_t count);
extern "C" int      __cdecl wcsncmp(const wchar_t* str1, const wchar_t* str2, size_t count);

extern "C" int      __cdecl _strnicmp(const char* str1, const char* str2, size_t count);
extern "C" int      __cdecl _wcsnicmp(const wchar_t* str1, const wchar_t* str2, size_t count);

extern "C" char*    __cdecl strncpy(char* strDest, const char* strSource, size_t iCount);   
extern "C" wchar_t* __cdecl wcsncpy(wchar_t* strDest, const wchar_t* strSource, size_t iCount); 

extern "C" char*    __cdecl _strrev(char* pStr);
extern "C" wchar_t* __cdecl _wcsrev(wchar_t* pStr);

#endif


Code: [Select]
// tchar.h
#ifndef tchar_h
   #define tchar_h
   #ifdef  _UNICODE
      typedef wchar_t     TCHAR;
      #define _T(x)       L## x
      #define _tmain      wmain
      #define _tWinMain   wWinMain
      #define _tprintf    wprintf
      #define _stprintf   swprintf
      #define _tcslen     wcslen
      #define _tcscpy     wcscpy
      #define _tcscat     wcscat
      #define _tcsncpy    wcsncpy
      #define _tcscmp     wcscmp
      #define _tcsncmp    wcsncmp
      #define _tcsnicmp   _wcsnicmp
      #define _tcsrev     _wcsrev
      #define FltToTch    FltToWch
      #define _ttol       _wtol
      #define _ttoi64     _wtoi64
   #else
      typedef char        TCHAR;
      #define _T(x)       x
      #define _tmain      main
      #define _tWinMain   WinMain
      #define _tprintf    printf
      #define _stprintf   sprintf
      #define _tcslen     strlen
      #define _tcscpy     strcpy
      #define _tcscat     strcat
      #define _tcsncpy    strncpy
      #define _tcscmp     strcmp
      #define _tcsncmp    strncmp
      #define _tcsnicmp   _strnicmp
      #define _tcsrev     _strrev
      #define FltToTch    FltToCh
      #define _ttol       atol
      #define _ttoi64     _atoi64
   #endif
#endif
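
To make the tchar.h mapping above concrete, here is a minimal portable sketch of how one source line resolves to either the char or the wchar_t version depending on whether _UNICODE is defined.  The names TCHAR_SK, T_SK, tcslen_sk, CountChars and BytesNeeded are hypothetical stand-ins chosen so the sketch doesn't collide with the real macros, and it leans on the standard headers only so it can be tried anywhere.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

#ifdef _UNICODE
   typedef wchar_t      TCHAR_SK;          // wide build
   #define T_SK(x)      L## x              // "abc" becomes L"abc"
   #define tcslen_sk    std::wcslen
#else
   typedef char         TCHAR_SK;          // narrow build
   #define T_SK(x)      x                  // literal left alone
   #define tcslen_sk    std::strlen
#endif

// Counts characters (not bytes) in whichever character type was selected.
std::size_t CountChars(const TCHAR_SK* pStr)
{
    return tcslen_sk(pStr);
}

// Bytes of storage needed for the text plus its terminating null.
std::size_t BytesNeeded(const TCHAR_SK* pStr)
{
    return (tcslen_sk(pStr) + 1) * sizeof(TCHAR_SK);
}
```

CountChars(T_SK("Hello")) gives 5 either way, while BytesNeeded(T_SK("Hello")) is 6 * sizeof(TCHAR_SK), so the byte count grows in a wide build without any change to the calling code.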

That’s It!!!

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #6 on: March 23, 2016, 01:30:33 AM »
And here is the zip...

Offline James C. Fuller

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 595
  • User-Rate: +11/-8
Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #7 on: March 23, 2016, 01:34:28 PM »
Fred,
  I'm a bit busy trying to track down the abnormalities of VS 2015 in my latest bc9Basic post, but I will get back and read this thread soon.
Looks like very interesting info as always.
Is the TCLib.LIB in the zip 64bit?
Great work!!
James

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #8 on: March 23, 2016, 06:16:04 PM »
Yes, I attached the 64 bit version from my new laptop (Win 10) created with VS 2015 Community.  Should be easy to compile & run those demo programs with it.  Man, I'm completely burned out on that project.  I mean completely!  Not that I wouldn't be more than happy to help anyone interested in it.  But last night after posting all that I picked up my banjo and started practicing that again!  Hadn't touched it since about Christmas.  Gotta do something different!

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #9 on: March 24, 2016, 02:44:09 AM »
The zip file contains all the source code, both the files that get compiled into the lib and the demo programs that get compiled into exes which use the lib.  However, I did include TCLib.lib in 64 bit format.  So to see how it all works, all you would really have to do is extract the files into a folder of your choice, then run the vcvarsall.bat file, for which Microsoft puts shortcuts on your Start Menu when you install Visual Studio or the SDK.  The lib is 64 bit, so you would have to make sure you use a shortcut that puts cl in 64 bit mode.  I believe with the Windows 7 SDK they have a setenv program that is used to specify x86 or x64 in debug or release mode.  Anyway, you would have to do whatever it takes to get the compiler set up for 64 bit compiles.

Then, the first line at the top of each demo program is the command line compilation string, which you can copy and paste into the command prompt window to compile the demo against TCLib.lib.  I'll give a demo in a minute after I switch over to my other login with administrator privileges and make some screen shots...

I downloaded the zip and extracted it to...

C:\Code\VStudio\VC++15\LibCTiny\x64\Test18

Then I opened my command prompt window with the shortcut provided by Microsoft (it was on my Start Menu) for 64 bit compiles.  Next I opened Demo16.cpp in Notepad, and copied the top line, which is the command line compilation string for Demo16.cpp...

cl Demo16.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib

Then paste that into the command prompt window and hit [ENTER].  Here's my screen...

Code: [Select]
C:\Code\VStudio\VC++15\LibCTiny\x64\Test18>cl Demo16.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

Demo16.cpp
Strings.cpp
Generating Code...
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:Demo16.exe
TCLib.lib
kernel32.lib
Demo16.obj
Strings.obj

C:\Code\VStudio\VC++15\LibCTiny\x64\Test18>Demo16.exe
    123,457
       -457
     10,000
         -1
          0
          2
          2


  123,456.8
     -457.0
   10,000.0
       -1.0
        0.0
        2.0
        2.0


 123,456.79
    -456.99
  10,000.00
      -0.99
       0.00
       1.98
       1.98

C:\Code\VStudio\VC++15\LibCTiny\x64\Test18>

This is Demo16.cpp

Code: [Select]
// cl Demo16.cpp Strings.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 9,728 Bytes With Full String Class, Which Is Mostly Needed
#define UNICODE
#define _UNICODE
#include <windows.h>
#include "stdlib.h"
#include "stdio.h"
#include "tchar.h"
#include "Strings.h"
extern "C" int _fltused=1;

int _tmain()
{
 double dblNums[]={123456.78912, -456.9876, 9999.99999, -0.987654, 0.0, 1.985, 1.9754321};
 String s1;
 
 for(size_t i=0; i<=2; i++)
 {
     for(size_t j=0; j<sizeof(dblNums)/sizeof(dblNums[0]); j++)
     {
         s1.Format(dblNums[j], 12, i, _T(','), _T('.'), true);
         s1.Print(true);
     }
     printf("\n\n");
 }
 getchar();
 
 return 0;
}
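
For anyone wondering what that Format() call is doing with its six arguments, here is a rough sketch of the idea.  It is not my String Class code: FormatSketch is a made up name, the parameter order is only inferred from the Demo16 call (value, field width, decimal places, thousands separator, decimal separator, right justify), and it uses std::string and snprintf from the standard library, which a TCLib build can't link against.  Treat it as an illustration of the logic only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats x with grouping separators into a padded field.
std::string FormatSketch(double x, std::size_t iFldWth, int iDecPlaces,
                         char cThousandsSep, char cDecimalSep, bool blnRightJustified)
{
    assert(iDecPlaces >= 0 && iDecPlaces <= 15);   // assumption: a sane decimal count

    char szBuffer[512];                            // big enough for any finite double
    std::snprintf(szBuffer, sizeof(szBuffer), "%.*f", iDecPlaces, x);
    std::string strNum(szBuffer);

    // Locate the integer part: after an optional minus sign, before the decimal point.
    std::size_t iStart = (strNum[0] == '-') ? 1 : 0;
    std::size_t iPoint = strNum.find('.');
    if (iPoint == std::string::npos)
        iPoint = strNum.length();
    else
        strNum[iPoint] = cDecimalSep;

    // Walk left from the decimal point, inserting a separator every three digits.
    for (std::size_t i = iPoint; i > iStart + 3; )
    {
        i -= 3;
        strNum.insert(i, 1, cThousandsSep);
    }

    // Pad with spaces out to the requested field width.
    if (strNum.length() < iFldWth)
    {
        std::size_t iPad = iFldWth - strNum.length();
        if (blnRightJustified)
            strNum.insert(0, iPad, ' ');
        else
            strNum.append(iPad, ' ');
    }
    return strNum;
}
```

With a field width of 12, FormatSketch(123456.78912, 12, 2, ',', '.', true) comes out as two spaces followed by 123,456.79, which is the right justified layout shown in the screen output above.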

Making the lib couldn't be easier.  Just type nmake TCLib.mak at the command prompt and hit [ENTER].  If you leave the TCLib.lib there from the download it'll be overwritten.  The make file will run through all the *.cpp files and compile them into *.obj files.  Then it'll create the library and put the obj files in the lib.  You'll still have all the newly created obj files though.  You can delete them if you want after the lib is created.  Or just use the one I provided.  If you want an x86 lib though you'll have to create it as I described in my main writeup.
« Last Edit: March 24, 2016, 03:03:28 AM by Frederick J. Harris »

Offline James C. Fuller

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #10 on: March 25, 2016, 01:25:12 PM »
Fred,
  I know your String class is your primary string object, but a function I use quite often that is missing is _stricmp, along with its compadre _wcsicmp.
Any reason you don't have them?

James

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #11 on: March 27, 2016, 04:41:39 AM »
I know it's hard to read, but that's one of the absolutely 'must have' string.h functions I had to write from scratch for my TCLib.  Here's the listing from the make file arranged vertically for easier reading...

crt_con_a.obj
crt_con_w.obj
crt_win_a.obj
crt_win_w.obj
memset.obj
newdel.obj
printf.obj
sprintf.obj
_strnicmp.obj
strncpy.obj
strncmp.obj
_strrev.obj
strcat.obj
strcmp.obj
strcpy.obj
strlen.obj
getchar.obj
alloc.obj
alloc2.obj
allocsup.obj
FltToCh.obj
atol.obj
_atoi64.obj
abs.obj
memcpy.obj
win32_crt_math.obj

It's the 9th one down.  It's used in String::InStr() to find matches as the BASIC implementations do, and also in String::Remove().  Both those members have a blnCaseSensitive flag.  If blnCaseSensitive is true, then strncmp is used.  If case insensitive matches or removals are specified, then _strnicmp is used.  All those functions listed above can be used without my String Class; it's just that I had to code them as foundational procedures upon which my String Class would be based.
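
Here is a rough sketch of how a blnCaseSensitive flag can pick between the two compares in a BASIC style InStr search.  It is not the actual String::InStr() code.  InStrSketch and StrNICmpSketch are made up names, the 1 based return value with 0 for 'not found' is the BASIC convention, and the sketch uses the standard headers rather than TCLib so it can be tried anywhere.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for _strnicmp: compares at most count chars, ignoring ASCII case.
int StrNICmpSketch(const char* s1, const char* s2, std::size_t count)
{
    for (std::size_t i = 0; i < count; i++)
    {
        int c1 = std::tolower(static_cast<unsigned char>(s1[i]));
        int c2 = std::tolower(static_cast<unsigned char>(s2[i]));
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (c1 == 0)             // both strings ended together
            break;
    }
    return 0;
}

// Returns the 1 based position of pMatch within pText, or 0 if it isn't found.
std::size_t InStrSketch(const char* pText, const char* pMatch, bool blnCaseSensitive)
{
    assert(pText != NULL && pMatch != NULL);
    std::size_t iLenText  = std::strlen(pText);
    std::size_t iLenMatch = std::strlen(pMatch);

    if (iLenMatch == 0 || iLenMatch > iLenText)
        return 0;
    for (std::size_t i = 0; i <= iLenText - iLenMatch; i++)
    {
        // The flag decides which compare routine examines each candidate position.
        int iCmp = blnCaseSensitive ? std::strncmp(pText + i, pMatch, iLenMatch)
                                    : StrNICmpSketch(pText + i, pMatch, iLenMatch);
        if (iCmp == 0)
            return i + 1;
    }
    return 0;
}
```

So InStrSketch("Frederick", "fred", false) finds a match at position 1, while the same call with true returns 0 because the case differs.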

Catechism question...

How are C++ Classes Implemented?

Answer...

Through C Standard Library functions. 

Here would be Demo17 and Demo18 which illustrate use of strncmp and _strnicmp 'in the raw', so to speak, without my String Class...

Code: [Select]
// cl Demo17.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 3,584 Bytes
#include <windows.h>
#include "stdio.h"

int main()
{
 char szFirst[]="Frederick";
 char szSecond[]="Fred";

 if(!strncmp(szFirst,szSecond,4))
    printf("Strings Are Equal!\n");
 else
    printf("Strings Are Not Equal!\n");
 getchar();
 
 return 0;
}

/*
   strncmp, wcsncmp used in String::InStr(), String::Remove() and String::Replace()
*/

Code: [Select]
// cl Demo18.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib
// 3,584 Bytes
#include <windows.h>
#include "stdio.h"

int main()
{
 char szFirst[]="Frederick";
 char szSecond[]="fred";

 if(!_strnicmp(szFirst,szSecond,4))
    printf("Strings Are Equal!\n");
 else
    printf("Strings Are Not Equal!\n");
 getchar();
 
 return 0;
}

/*
   _strnicmp, _wcsnicmp used in String::InStr() and String::Remove()
*/

As you can see, I'm only at 3,584 bytes for both of those examples in x64, UNICODE.  Of course, that could be coded with my String Class if the bloat would be acceptable.  It would likely cost you every bit of 2 or 3 k of bloat, and you might end up with a mammoth executable in the 5 k range.

Couldn't answer your question right away.  Had to go to a funeral today (it wasn't mine :) ).

Did you ever get your display anomalies fixed?

Here is my final version of your ec02 program with debug capabilities built in and comments.  I've tested it with VC15, VC19, GCC 4.8 in x86 and x64 at various DPI Display Resolution settings, and at least from my end everything looks OK...

Code: [Select]
// Uses/links with  LIBCMT:   cl ec02a.cpp /O1 /Os /MT kernel32.lib user32.lib gdi32.lib
// Uses/links with TCLib:     cl ec02a.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib user32.lib gdi32.lib
//  5,632 Bytes               TCLib, x86, VC19, UNICODE
//  6,144 Bytes               TCLib, x64, VC15, UNICODE
//  7,168 Bytes               TCLib, x64, VC19, UNICODE
// 19,968 Bytes               GCC 4.8, x64, UNICODE
// 88,064 Bytes               VC15, x64 Std. UNICODE, Build With C Runtime
#define TCLib
#ifndef UNICODE
   #define UNICODE            // App demonstrates Dpi Aware coding.  Note, you can't build with both
#endif                        // #define TCLib and #define Debug.  As of yet I have not provided
#ifndef _UNICODE              // FILE* i/o capabilities in TCLib.  If TCLib is defined, TCLib specific
   #define _UNICODE           // versions of stdio.h and tchar.h must be used.  Hence, those files
#endif                        // need to be placed in the app's directory, or pathed specifically.
#include <windows.h>
#ifdef TCLib                  // I'm explicitly calling SetProcessDPIAware() in SetMyProcessDpiAware()
   #include "stdio.h"         // because that function isn't available in all build environments
   #include "tchar.h"         // out there.
#else   
   #include <stdio.h>
   #include <tchar.h>
#endif   
#include "Form1.h"            // If Debug is defined an Output.txt Log File is created, and debug
//#define Debug               // information logged to there.
#ifdef Debug
   FILE* fp=NULL;
#endif


void SetMyProcessDpiAware()   // This function isn't available on the older GCC versions I have.
{                             // Calling this function tells Windows the app is High DPI Aware,
 BOOL (__stdcall* pFn)(void); // and so Windows won't virtualize fonts and sizes of things and
                              // wreck them.
 HINSTANCE hInstance=LoadLibrary(_T("user32.dll"));
 if(hInstance)
 {
    pFn=(BOOL (__stdcall*)(void))GetProcAddress(hInstance,"SetProcessDPIAware");
    if(pFn)
       pFn();
    FreeLibrary(hInstance);
 }
}


LRESULT fnWndProc_OnCreate(WndEventArgs& Wea)
{
 BOOL blnDpiAware=FALSE;      // This is the creation/instantiation function, i.e., Constructor
 DpiData* pDpiData=NULL;      // for the "Form1" Class registered in WinMain, i.e., the main
 HANDLE hHeap=NULL;           // and only app window or start up form/dialog/window.  In WinMain
 HFONT hFont1=NULL;           // I allocate three 'spots' for app specific data in the
 HFONT hFont2=NULL;           // WNDCLASS::cbWndExtra bytes.  That would be 3 * sizeof(void*) or
 int dpiX,dpiY;               // 24 bytes in x64 and 12 bytes in x86.  I store the DpiData* in
 HDC hDC=NULL;                // offset #1, hFont1 in offset #2, and hFont2 in offset #3.
 HWND hCtl;
 
 #ifdef Debug
 fp=fopen("Output.txt","w");
 fprintf(fp,"Entering fnWndProc_OnCreate()\n");
 fprintf(fp,"  Wea.hWnd         = %p\n",Wea.hWnd);
 fprintf(fp,"  sizeof(LRESULT)  = %u\n",(unsigned int)sizeof(LRESULT));
 #endif
 Wea.hIns    = ((LPCREATESTRUCT)Wea.lParam)->hInstance;
 hHeap       = GetProcessHeap();
 blnDpiAware = IsProcessDPIAware();
 #ifdef Debug
 fprintf(fp,"  blnDpiAware      = %d\n",blnDpiAware);
 #endif
 
 // Deal With DPI                                 // This code will normalize, fix, whatever you want
 hDC          = GetDC(NULL);                      // to call it, the bad things that happen on modern
 dpiX         = GetDeviceCaps(hDC, LOGPIXELSX);   // displays when the user alters DPI settings through
 dpiY         = GetDeviceCaps(hDC, LOGPIXELSY);   // Control Panel.  If code like this isn't implemented,
 pDpiData     = (DpiData*)HeapAlloc               // and the user alters DPI settings from those used
 (                                                // by the coder (you) when the app was developed, text
  hHeap,                                          // and controls will almost certainly be 'clipped', and
  HEAP_ZERO_MEMORY,                               // fonts will be 'virtualized' such that they'll become
  sizeof(DpiData)                                 // fuzzy.  Note I have a DpiData struct defined in
 );                                               // Form1.h.  The purpose of that is to persist DPI
 if(!pDpiData)                                    // factors across function calls.  Not needed here, but
    return -1;                                    // useful in more complex apps.
 SetWindowLongPtr(Wea.hWnd,0*sizeof(void*),(LONG_PTR)pDpiData);
 pDpiData->rx = dpiX/96.0;
 pDpiData->ry = dpiY/96.0;
 ReleaseDC(NULL,hDC);                             // hDC came from GetDC(NULL), so release it with NULL too
 #ifdef Debug
 fprintf(fp,"  pDpiData         = %p\n",pDpiData);
 fprintf(fp,"  dpiX             = %d\n",dpiX);
 fprintf(fp,"  dpiY             = %d\n",dpiY);
 fprintf(fp,"  pDpiData->rx     = %5.3f\n",pDpiData->rx);
 fprintf(fp,"  pDpiData->ry     = %5.3f\n",pDpiData->ry);
 #endif
 
 // Position/Center Main App Window
 int iCxScreen=GetSystemMetrics(SM_CXSCREEN);
 int iCyScreen=GetSystemMetrics(SM_CYSCREEN);
 int x = (iCxScreen - SizX(338))/2;               // Screen metrics are already in device pixels,
 int y = (iCyScreen - SizY(320))/2;               // so only the window size gets scaled.
 MoveWindow(Wea.hWnd,x,y,SizX(338),SizY(320),FALSE);
 #ifdef Debug
 fprintf(fp,"  iCxScreen        = %d\n",iCxScreen);
 fprintf(fp,"  iCyScreen        = %d\n",iCyScreen);
 fprintf(fp,"  x                = %d\n",x);
 fprintf(fp,"  y                = %d\n",y);
 #endif
 
 // Instantiate Child Window Objects
 double dblFont1 = 10.0;
 double dblFont2 = 10.0;
 DWORD  dwStyles = WS_CHILD|WS_VISIBLE|WS_BORDER|ES_AUTOVSCROLL|ES_WANTRETURN|ES_AUTOHSCROLL|ES_MULTILINE;
 
 hCtl=CreateWindow(_T("edit"),NULL,dwStyles,SizX(15),SizY(15),SizX(295),SizY(112),Wea.hWnd,(HMENU)IDC_EDIT1,Wea.hIns,NULL);
 hFont1=CreateFont(-MulDiv(dblFont1,dpiY,72),0,0,0,FW_NORMAL,0,0,0,ANSI_CHARSET,OUT_TT_PRECIS,CLIP_DEFAULT_PRECIS,DEFAULT_QUALITY,FF_DONTCARE,_T("Arial"));
 SetWindowLongPtr(Wea.hWnd,1*sizeof(void*),(LONG_PTR)hFont1);
 SendMessage(hCtl, (UINT)WM_SETFONT, (WPARAM)hFont1, (LPARAM)TRUE);
 SetWindowText(hCtl, _T(" Text in this control set to RED ON WHITE"));
 
 hCtl=CreateWindow(_T("edit"),NULL,dwStyles,SizX(15),SizY(160),SizX(295),SizY(112),Wea.hWnd,(HMENU)IDC_EDIT2,Wea.hIns,NULL);
 hFont2=CreateFont(-MulDiv(dblFont2,dpiY,72),0,0,0,FW_NORMAL,0,0,0,ANSI_CHARSET,OUT_TT_PRECIS,CLIP_DEFAULT_PRECIS,DEFAULT_QUALITY,FF_DONTCARE,_T("Garamond"));
 SetWindowLongPtr(Wea.hWnd,2*sizeof(void*),(LONG_PTR)hFont2);
 SendMessage(hCtl, (UINT)WM_SETFONT, (WPARAM)hFont2, (LPARAM)TRUE);
 SetWindowText(hCtl, _T(" Text in this control set to Green ON BLACK"));
 #ifdef Debug
 fprintf(fp,"Leaving fnWndProc_OnCreate()\n\n");
 #endif
 
 return 0;
}


LRESULT fnWndProc_OnCtlColorEdit(WndEventArgs& Wea)
{
 int iCtlId=GetDlgCtrlID((HWND)Wea.lParam);
 
 if(iCtlId==IDC_EDIT1)
 {
    SetTextColor((HDC)Wea.wParam, RGB(255, 0, 0));
    SetBkMode((HDC)Wea.wParam,TRANSPARENT);
    return (LRESULT)GetStockObject(WHITE_BRUSH);
 }
 if(iCtlId==IDC_EDIT2)
 {
    SetTextColor((HDC)Wea.wParam, RGB(0, 255, 0));
    SetBkMode((HDC)Wea.wParam,TRANSPARENT);
    return (LRESULT)GetStockObject(BLACK_BRUSH);
 }  
 
 return DefWindowProc(Wea.hWnd, WM_CTLCOLOREDIT, Wea.wParam, Wea.lParam);
}


LRESULT fnWndProc_OnDestroy(WndEventArgs& Wea)
{
 DpiData* pDpiData=NULL;
 BOOL blnFree=FALSE;
 HFONT hFont1=NULL;
 HFONT hFont2=NULL;
 HFONT hFont=NULL;
 
 #ifdef Debug
 fprintf(fp,"Entering fnWndProc_OnDestroy()\n");
 fprintf(fp,"  Wea.hWnd         = %p\n",Wea.hWnd);
 #endif
 pDpiData=(DpiData*)GetWindowLongPtr(Wea.hWnd,0*sizeof(void*));
 blnFree=HeapFree(GetProcessHeap(),0,pDpiData);
 #ifdef Debug
 fprintf(fp,"  pDpiData         = %p\n",pDpiData);
 fprintf(fp,"  blnFree          = %d\n",blnFree);
 #endif
 hFont=(HFONT)GetWindowLongPtr(Wea.hWnd,1*sizeof(void*));
 blnFree=DeleteObject(hFont);
 #ifdef Debug
 fprintf(fp,"  hFont1           = %p\n",hFont);   // log the handle just retrieved, not the unused local
 fprintf(fp,"  blnFree(hFont1)  = %d\n",blnFree);
 #endif
 hFont=(HFONT)GetWindowLongPtr(Wea.hWnd,2*sizeof(void*));
 blnFree=DeleteObject(hFont);
 #ifdef Debug
 fprintf(fp,"  hFont2           = %p\n",hFont);   // log the handle just retrieved, not the unused local
 fprintf(fp,"  blnFree(hFont2)  = %d\n",blnFree);
 #endif
 PostQuitMessage(0);
 #ifdef Debug
 fprintf(fp,"Leaving fnWndProc_OnDestroy()\n");
 fclose(fp);
 #endif
 
 return 0;
}


LRESULT CALLBACK fnWndProc(HWND hwnd, unsigned int msg, WPARAM wParam, LPARAM lParam)
{
 WndEventArgs Wea;

 for(unsigned int i=0; i<dim(EventHandler); i++)
 {
     if(EventHandler[i].iMsg==msg)
     {
        Wea.hWnd=hwnd, Wea.lParam=lParam, Wea.wParam=wParam;
        return (*EventHandler[i].fnPtr)(Wea);
     }
 }

 return (DefWindowProc(hwnd, msg, wParam, lParam));
}


int WINAPI _tWinMain(HINSTANCE hIns, HINSTANCE hPrevIns, LPTSTR lpszArgument, int iShow)
{
 TCHAR szClassName[]=_T("Form1");
 MSG messages;
 WNDCLASS wc;
 HWND hWnd;

 memset(&wc,0,sizeof(WNDCLASS));
 SetMyProcessDpiAware();
 wc.lpszClassName = szClassName,                     wc.lpfnWndProc   = fnWndProc;
 wc.hIcon         = LoadIcon(NULL,IDI_APPLICATION),  wc.hbrBackground = (HBRUSH)(COLOR_BTNFACE+1);
 wc.hInstance     = hIns,                            wc.hCursor       = LoadCursor(NULL,IDC_ARROW),         
 wc.cbWndExtra    = 3*sizeof(void*);
 RegisterClass(&wc);
 hWnd=CreateWindow(szClassName,_T("Edit Ctrl Colors and Fonts"),WS_OVERLAPPEDWINDOW,0,0,0,0,HWND_DESKTOP,0,hIns,0);
 ShowWindow(hWnd,iShow);
 while(GetMessage(&messages,NULL,0,0))
 {
    TranslateMessage(&messages);
    DispatchMessage(&messages);
 }

 return (int)messages.wParam;
}

Code: [Select]
//Form1.h
#ifndef Form_h
#define Form_h

#define IDC_EDIT1                 2000
#define IDC_EDIT2                 2005
#define dim(x)                    (sizeof(x) / sizeof((x)[0]))
#define SizX(x)                   ((int)((x) * pDpiData->rx))   // For DPI Scaling; needs a DpiData* named pDpiData in scope
#define SizY(y)                   ((int)((y) * pDpiData->ry))   // For DPI Scaling; needs a DpiData* named pDpiData in scope
#ifdef TCLib
   extern "C" int                 _fltused=1;
#endif
   
struct WndEventArgs
{
 HWND                             hWnd;
 WPARAM                           wParam;
 LPARAM                           lParam;
 HINSTANCE                        hIns;
};

LRESULT fnWndProc_OnCreate        (WndEventArgs& Wea);
LRESULT fnWndProc_OnCtlColorEdit  (WndEventArgs& Wea);
LRESULT fnWndProc_OnDestroy       (WndEventArgs& Wea);

struct EVENTHANDLER
{
 unsigned int                     iMsg;
 LRESULT                          (*fnPtr)(WndEventArgs&);
};

const EVENTHANDLER EventHandler[]=
{
 {WM_CREATE,                      fnWndProc_OnCreate},
 {WM_CTLCOLOREDIT,                fnWndProc_OnCtlColorEdit},
 {WM_DESTROY,                     fnWndProc_OnDestroy}
};

struct DpiData                   
{                                 
 double                           rx;
 double                           ry;
};

#endif
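
To make the arithmetic behind SizX/SizY concrete, here is a small sketch that needs no windows.h.  ScaleFor, Scale and CenterOrigin are names made up for illustration; they are not in ec02a.cpp or Form1.h.  It assumes 96 DPI is the design baseline, as the rx = dpiX/96.0 line does, and CenterOrigin is only one way of centering a window.

```cpp
#include <cassert>

// Hypothetical helpers, not part of ec02a.cpp or Form1.h.
// 96 DPI is the baseline the layout was designed at (100% scaling).
double ScaleFor(int dpi)
{
 assert(dpi > 0);
 return dpi / 96.0;
}

// Same arithmetic as the SizX/SizY macros: design units -> device pixels,
// truncated toward zero by the cast.
int Scale(int designUnits, double factor)
{
 return (int)(designUnits * factor);
}

// Top-left coordinate that centers a window of 'designSize' design units
// on a screen that is 'screenPixels' wide (or high).  Only the window size
// is scaled, because the screen metric is assumed to be in device pixels.
int CenterOrigin(int screenPixels, int designSize, double factor)
{
 return (screenPixels - Scale(designSize, factor)) / 2;
}
```

At 144 DPI (the 150% setting) ScaleFor gives 1.5, so the 338 unit window width becomes 507 pixels, and on a 1920 pixel wide screen CenterOrigin puts the left edge at 706.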

I added a #define TCLib near the top of the program, which you can comment out to build against the C Runtime instead of TCLib.  There is also a #define Debug which opens an output log file.  You can't enable that while #define TCLib is active though, because I haven't implemented file output in TCLib yet.  That's near the top of my todo list, i.e., adding file support to TCLib for debugging purposes.
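
The restriction on combining the two defines can also be enforced by the preprocessor, so a bad combination stops the build with a readable message instead of a pile of FILE* errors.  What follows is only a sketch: LOG and kDebugBuild are hypothetical names that do not appear in the program above, and the macro takes the log file as its first argument rather than using the global fp.

```cpp
// Reject the one combination that cannot build: Debug needs FILE* i/o.
#if defined(TCLib) && defined(Debug)
   #error "Debug logging needs FILE* i/o, which TCLib does not provide yet."
#endif

#ifdef Debug
   #include <stdio.h>
   // Hypothetical macro: forwards to fprintf() on the caller's log file.
   #define LOG(fp, ...) fprintf((fp), __VA_ARGS__)
   const bool kDebugBuild = true;
#else
   // Without Debug the call and its arguments vanish from the binary.
   #define LOG(fp, ...) ((void)0)
   const bool kDebugBuild = false;
#endif
```

With something like this, a line such as LOG(fp,"  dpiX = %d\n",dpiX); could stand in for each three line #ifdef Debug / fprintf / #endif group, at the cost of relying on variadic macro support in the compiler.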

Offline Frederick J. Harris

Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #12 on: March 27, 2016, 04:47:45 AM »
All this work I had to do to try to make a usable language out of C++.   ;D   All the work they've done with C++ since C99!  Too bad they didn't make a usable language out of it instead of giving us stuff like lambda functions. :(
« Last Edit: March 27, 2016, 04:53:47 AM by Frederick J. Harris »

Offline James C. Fuller

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 595
  • User-Rate: +11/-8
Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #13 on: March 27, 2016, 11:58:32 AM »
Fred,
  The 9th one down is _strnicmp.obj.
I want stricmp, without the length parameter; it should be possible to implement it the same way you did strcmp. Both of the functions used below are in kernel32 as well.
James

Code: [Select]
#include <windows.h>
#include "string.h"

int __cdecl stricmp(const char* string1, const char* string2)   
{
 return lstrcmpiA(string1,string2);
}

int __cdecl wcsicmp(const wchar_t* string1, const wchar_t* string2)   
{
 return lstrcmpiW(string1,string2);
}
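
For comparison, here is a sketch of a length-free, case-insensitive compare that does not call into kernel32 at all.  ascii_lower and ascii_stricmp are made-up names, and the routine only folds the ASCII letters A-Z.  Unlike lstrcmpiA it knows nothing about locales or accented characters, and for unequal strings its ordering may differ from the word sort lstrcmpi uses.

```cpp
#include <cassert>

// Hypothetical helper: fold ASCII 'A'..'Z' to lower case, leave the rest alone.
static unsigned char ascii_lower(unsigned char c)
{
 return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

// Hypothetical ASCII-only stand-in for stricmp.  Returns <0, 0 or >0 like
// strcmp, comparing the two strings as if both were lower case.
int ascii_stricmp(const char* string1, const char* string2)
{
 assert(string1 && string2);
 const unsigned char* p1 = (const unsigned char*)string1;
 const unsigned char* p2 = (const unsigned char*)string2;

 // Advance while string1 has characters left and the folded characters match.
 while(*p1 && ascii_lower(*p1) == ascii_lower(*p2))
 {
    p1++, p2++;
 }

 return (int)ascii_lower(*p1) - (int)ascii_lower(*p2);
}
```

The trade-off is a few dozen bytes of code in the library against one more import from kernel32; which is smaller in the final executable would have to be measured.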

Offline James C. Fuller

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 595
  • User-Rate: +11/-8
Re: Minimize Program Size By Eliminating The C Runtime Library
« Reply #14 on: March 27, 2016, 12:12:06 PM »
Fred,
  How difficult would it be to add the "+=" operator to your String class?
bc9Basic optimizes (and it's buried deep in old bcx code) a = a + z to a+= z.

James
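
As a rough idea of what a += operator involves, here is a sketch on a stand-in class.  MiniString is written only for this illustration; it is not the String class from Strings.h, whose internals are not shown in this post, and it uses the standard headers to stay self-contained.  The point is that a += z can append into spare capacity in place, while a = a + z builds a temporary first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in, NOT the String class from Strings.h.
class MiniString
{
 public:
 MiniString(const char* psz = "")
 {
    assert(psz);
    m_len = std::strlen(psz);
    m_cap = m_len + 1;
    m_buf = new char[m_cap];
    std::memcpy(m_buf, psz, m_cap);              // copies the terminating NUL too
 }

 MiniString(const MiniString& s) : m_len(s.m_len), m_cap(s.m_len + 1), m_buf(new char[s.m_len + 1])
 {
    std::memcpy(m_buf, s.m_buf, m_len + 1);
 }

 MiniString& operator=(const MiniString& rhs)
 {
    if(this != &rhs)
    {
       char* p = new char[rhs.m_len + 1];
       std::memcpy(p, rhs.m_buf, rhs.m_len + 1);
       delete [] m_buf;
       m_buf = p, m_len = rhs.m_len, m_cap = rhs.m_len + 1;
    }
    return *this;
 }

 ~MiniString() { delete [] m_buf; }

 // Append in place.  Grows the buffer (doubling) only when it is too small.
 // Safe for self-append (a += a): the old buffer is read before it is freed,
 // and memmove copes with the overlap in the no-grow case.
 MiniString& operator+=(const MiniString& rhs)
 {
    std::size_t newLen = m_len + rhs.m_len;
    if(newLen + 1 > m_cap)
    {
       std::size_t newCap = 2 * (newLen + 1);
       char* p = new char[newCap];
       std::memcpy(p, m_buf, m_len);
       std::memcpy(p + m_len, rhs.m_buf, rhs.m_len + 1);
       delete [] m_buf;
       m_buf = p, m_cap = newCap;
    }
    else
       std::memmove(m_buf + m_len, rhs.m_buf, rhs.m_len + 1);
    m_len = newLen;
    return *this;
 }

 const char* c_str() const { return m_buf; }
 std::size_t length() const { return m_len; }

 private:
 std::size_t m_len;                              // characters, not counting the NUL
 std::size_t m_cap;                              // allocated bytes, including the NUL
 char*       m_buf;
};

// operator+ written in terms of +=: lhs is taken by value, so this is where
// the temporary that a = a + z pays for gets created.
MiniString operator+(MiniString lhs, const MiniString& rhs)
{
 lhs += rhs;
 return lhs;
}
```

A translator that rewrites a = a + z as a += z would skip that temporary and the extra copy back into a, which is presumably why bc9Basic does it.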