README file for matlab2fmex.m CONTENTS: 0. DISCLAIMER 1. OBJECTIVE 2. MOTIVATION 3. BUG REPORTS and WISH LIST 4. MATLAB2FMEX CAPABILITIES 5. MATLAB2FMEX LIMITATIONS 6. HOW TO US MATLAB2FMEX 7. EXAMPLES 8. REVISION HISTORY 0. DISCLAIMER: Matlab is a trademark of the Mathworks company and is owned by them. The author makes no guarantee express or implied of any kind as to the applicability, usefulness, efficacy, bug-freeness, or the accuracy of the ensuing results from using matlab2fmex. The author bears no responsibility for any unwanted effect resulting from the use of this program. The author is not affiliated with the Mathworks. The source code is given in full in the hopes that it will prove useful. Devlopment is done through sourceforge at matlab2fmex.sourceforge.net. 1. OBJECTIVE: matlab2fmex.m is a small translator which converts numerical Matlab m-files to Fortran90 mex files and then compiles with Matlab's mex. 2. MOTIVATION: 1) Matlab is becoming ubiquitous in the engineering and scientific communities for its ease of use coupled with its powerful libraries. Yet the biggest disadvantage to Matlab is that it is slow when compared to optimized and compiled languages such as C and especially Fortran. With the recent 6.5+ release, Matlab execution speed has been improved, but compiled fortran is almost always at least a few times faster. 2) While Matlab provides a built-in compiler (mcc), this, too, seems slow and in fact many times produces executables whose performance is similar or worse to their parent m-file (see examples below). If one is willing to forgo some of the fancier things of Matlab like cells, structs, variable retyping, variable resizing, etc., while still maintaining Matlab's algorithm devolpment environment, then significant speedups can be achieved. The intent of matlab2fmex is to retain the advantages of both languages: use Matlab for development and debugging, then convert the numerically intensive routines to mex files for speed. 3) Fortran90/95/2000/etc syntax is growing closer to Matlab's and vice versa. 3. BUG REPORTS and WISH LIST: For all bug reports, a wish list for matlab2fmex, and suggestions, see http://matlab2fmex.sourceforge.net/ or email barrowes@users.sourceforge.net Also see hints file accompanying this distribution for other tips. 4. MATLAB2FMEX CAPABILITIES: matlab2fmex is aimed at converting computationally intensive numerical functions to Fortran90 mex-files. As such, only basic data types and constructions are recommended. matlab2fmex can handle: integer, real, and complex scalars and 2-D arrays loop and branch constructs array indexing mathematical operations simple math functions using Fortran intrinsics, e.g. trig functions etc. Many Matlab functions using Fortran90 generic interfaces Most other Matlab functions by calling Matlab at execution time. (callback are slow, however) variable size inputs and outputs (see below for restrictions***) logical operator conversion multiple functions (helper functions => Fortran subroutines) Integral colon expressions __must have brackets [ and ] around them__ 5. MATLAB2FMEX LIMITATIONS: matlab2fmex was not made for replacing Matlab's extensive non-numerical capabilities. Thus, functionality for GUIs, debugging, java, objects, etc. are not included. As a general rule, only core numeric code should be attempted to be converted. Things to be avoided: cells structs strings (exception: in disp() and error() statements) sparse data uintxx intxx objects the same uppercase and lowercase variable names (e.g. both INDEX and index as variables) inline functions java anything anything but 2-D doubles (but can be real or complex) dynamic array resizing for local variables (see below for restrictions***) *.m-file function conversion with no outputs statements separated by commas instead of semicolons loading and saving *.mat files (on wish list) *** Inputs and outputs may change in size, but variables local to the m-file may not _unless_ their dimension(s) are the same size as an input variables' dimension(s). No variable is allowed to change in size after function instantiation. 6. HOW TO USE MATLAB2FMEX: There are 2 steps required to convert a Matlab m-file(s) to a mex-able Fortran90 file. 1) matlab2fmex needs to find a *.mat data file which contains a typical workspace generated by the function which is being converted. This is easily constructed by inserting a keyboard command at the end of the *.m file, then, at the k>> prompt simply type "save *" where * is the name of the *.m file without the .m extension. matlab2fmex uses this workspace to make decisions which depend on the variables in the *.m file, such as real or complex. Accordingly, if an variable may be complex, it is best to cause that variable to be complex when the workspace is saved so that the *.f90 file will be as robust as possible. When converting multiple *.m files (a main function and helper functions), a separate *.mat file must exist for each function. Alternatively, you can use matlab2fmex_save to save a typical workspace file. Simply execute: matlab2fmex_save('function_call_string'); and the workspace file will be saved for you. (See below for some examples and further explanation.) 2) The second step is to call matlab2fmex. The format is the following: matlab2fmex({'filename1' 'filename2' ...},{[compiler flags for main routine (filename1)],[compiler flags for subroutine 1 (filename2)],...}); The result will be a fortran mex file filename1.f90 along with the compiled mex function filename1.mex. This mex function should be callable with the same syntax as the original function. When sizing output and local variables, the converter looks for any input variable dimension which happens to be the same as each output or local variable dimension. That dimension of the local variable is then assigned that dimension of that input variable. Upon failing to find an input variable with the same dimensional size, the converter declares that dimension of the local variable to be the _static_ size it is is the saved workspace. There are a few parameters which the user can specify as well: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_kb=0;%want_kb --> 0 Do not stop after each step, compile file. (default) %--> 1 Enter Matlab keyboard after each step.(for debugging) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_ss=0;%want_ss%-->-1 No subscript vectorization. (for mostly scalar operations) %--> 0 Relaxed subscript interpreting. (default) %--> 1 Stricter subscripting (slower, for variable subscripts) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_op=0;%want_op%-->-1 No operator replacement (use when want_ss=-1) %--> 0 Don't replace +, -, or scalar * and / operators. (default) %--> 1 Replace all math operators with interface calls. (slower) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_fb=0;%want_fb --> 0 Don't display progress throughout compilation. (default) %--> 1 Display feedback information. (potentially annoying) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_ct=1;%want_ct --> 0 Don't mex *.f90 upon completion. %--> 1 mex the resulting *.f90 file. (default) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_0=0;%want_mf --> 0 Don't zero all vars %--> 1 Set all vars = 0 before computational routine %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_br=0;%want_br --> 0 Use (/ and /) instead of [ and ] (all Fortran 90's) %--> 1 Use [ and ] (Some Fortran 90's) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_sb=0;%want_sb --> 0 Convert as a full mex-file %--> 1 Convert as a subroutine (sets want_ct to 0). %--> 2 Convert as a function (one output, sets want_ct to 0). %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_in=1;%want_in --> 0 Don't allow local vars to be integers %--> 1 Allow local vars to be integers (default) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% want_cs=0;%want_in --> 0 Assume all sqrt's are of real, positive numbers %--> 1 Assume all sqrt's are complex %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Shown here are the default flag settings. Note: Sometimes it is useful to modify the resulting *.f90 to improve performance. Another Note: In your mexopts.sh, you must set the default to double precision or 8 bytes for 1 real. Check your Fortran90 compiler flags for how to do this and add this option to the appropriate 'FFLAGS=' in mexopts.sh. Important: do not set integers to double size, only real and complex data types. 7. EXAMPLES: A few examples from the ~MATLAB/extern/examples/compiler/ from your Matlab distribution with performance comparisons along with some other examples. The following computation times are from a PC running RedHat Linux 9.0 with 1.5GB of RAM and the Intel Fortran90 compiler. All of the following examples can be run by running the script TESTING from the examples directory in the full distribution. If you do not have the Matlab compiler, set the have_mcc flag in the TESTING script to 0. These examples were run under 6.5.1. Further examples adapted from the Falcon/Majic project: http://polaris.cs.uiuc.edu/~galmasi/majic/majic.html can be run by executing the TESTING_uoi.m script in the testsuite_m2f directory. A bar graph comparing execution times will result from the tests. First example: Original file myfunc.m: function z = myfunc(x) % Doc example. Chapter 4. % $Revision: 1.1 $ temp1 = x .* 10 .* sin(x); z = round(temp1); Use matlab2fmex_save (see below) to save a sample workspace: matlab2fmex_save('z=myfunc(rand(3,3));'); When running native matlab: >> x=rand(2000); >> t=cputime;z=myfunc(x);cputime-t ans = 1.25 Evem when we compile using mcc, the performance does not improve: >> mcc -x myfunc.m >> t=cputime;z=myfunc(x);cputime-t ans = 1.25999999999999 >> Now convert it to a mex-file using matlab2fmex: >> matlab2fmex('myfunc'); Converting --- myfunc.m ==> myfunc.f90 ==> myfunc.mex Original: function z = myfunc(x) temp1 = x .* 10 .* sin(x); z = round(temp1); putting decimal points etc. ....................... m finished fixing bracketed assignments ...................... a finished fixing bracketed assignments II ................... t finished operator changeover ............................... l finished making 2 subscripts ............................... a finished fixing logical operators .......................... b finished subscript vectorizing ............................. 2 finished fixing if, for, etc keywords ...................... f finished fixing colon expressons ........................... m finished performing word conversion ........................ e finished fixing assignments ................................ x finished 2 lines Setting the size of output var z equal to the input var x. Setting the size of local var temp1 equal to the size of input var x. Finished writing myfunc.f90: temp1 = x_f * 10.0 * sin(x_f); z_f = ANInt(temp1); Starting compilation of myfunc.f90 Calling mex to compile the output at myfunc.f90 ==> mex -v myfunc.f90 /mit/matlab_v6.5.1/distrib -L/mit/matlab_v6.5.1/distrib/bin/Undetermined -lmx -lmex -lmat /mit/matlab_v6.5.1/distrib -L/mit/matlab_v6.5.1/distrib/bin/glnx86 -lmx -lmex -lmat -> mexopts.sh sourced from directory (DIR = $HOME/.matlab/R13) >> And myfunc.f is: temp1 = x_f * 10.0 * sin(x_f) z_f = ANInt(temp1) (with appropriate supporting gateway and subroutine code before and after) now execute the same call to myfunc: >> t=cputime;z=myfunc(x);cputime-t ans = 0.42 >> Not too bad: almost three times better than both native and mcc compiled Matlab. NOTES: Should there be an error during conversion, the user will be given a choice whether to try to continue, or start a keyboard environment to try to see what the problem is. This (approximate - no comments are included) line number of the *.m file will help to know what line is in question and also provides feedback on the progress of the converter. As another example, consider the routine zsdmn2.m which computes expansion coefficients for spheroidal wave functions. Again, use matlab2fmex_save to save a workspace: >> matlab2fmex_save('out=zsdmn2(0,10,200+200i,500,1,100,zeros(102,1));'); >> First, let's run the original: >> tt=cputime;out=zsdmn2(0,10,200+200i,500,1,10000,zeros(10002,1));cputime-tt ans = 0.64 >> Now convert it: >> matlab2fmex('zsdmn2',[0 0 0 0 1 0 0 0 0 0]); And now run it: >> tt=cputime;out=zsdmn2(0,10,200+200i,500,1,10000,zeros(10002,1));cputime-tt ans = 0.03 >> This yields a better speedup: >> 0.64/.03 21.3333333333333 REMARK: Matlab functions which don't have an equivalent fortran intrinsic and do not yet have a conversion routine built in are dealt with in one of two ways: 1) Call Matlab back from the converted file, or 2) Use slatec to provide a few more functions. The second category contains functions such as the bessel functions (it is an ongoing process to convert all functions possible to native fortran through libraries). Obviuously, to use these the user must have the slatec and lapack libraries installed in their system somewhere. Note: some functions such as inv and fft are so optimized in Matlab that they are intentionally left as callbacks as the compiled fortran would not offer substantial performance increases. To illustrate, consider the next example which is gasket.m from the Matlab examples directory, but slightly contrived in order to demonstrate some of the capabilities of matlab2fmex. imsize was added so that theImage output variable could be dynamically sized, and randsize was added as an input for the size of theRand. Without randsize, the size of theRand would be statically defined. This procedure of putting in dummy variables is necessary whenever an output or internal variable has to be sized depending on the input values, but none of the input variables themselves are of the same size. function theImage = gasket(numPoints,imsize,randsize) %GASKET An image of a Sierpinski Gasket. theImage = zeros(1000,1000); corners = ([866 1;1 500;866 1000].')';startPoint = [866 1]; theRand = ceil(rand(numPoints,1)*3); for i=1:numPoints startPoint = floor((corners(theRand(i),:)+startPoint)/2); theImage(startPoint(1),startPoint(2)) = 1; end plot(theRand); startPoint(1,1:2)=max(corners); disp(['startPoint=',num2str(startPoint)]); theImage(theImage>0)=asech(1/2); In Matlab this takes: >> n=5000000;imsize=zeros(1,1000);t=cputime;g=gasket(n,imsize,zeros(n,1));cputime-t startPoint=866 1000 ans = 10.46 >> After adding imsize and randsize to the input variable list and saving the workspace with >> n=500000;imsize=zeros(1,1000); >> matlab2fmex_save('g=gasket(n,imsize,zeros(n,1));') we use matlab2fmex: >> matlab2fmex('gasket'); Converting --- gasket.m ==> gasket.f90 ==> gasket.mex Starting compilation of gasket.f Creating mexcallback module... Creating mexoperators module... Creating mexfunctions module... Calling mex to compile the output at gasket.f ==> mex gasket.f mexfunctions.f mexoperators.f mexcallback.f -lslatec -llapack >> Then we have: >> n=5000000;imsize=zeros(1,1000);t=cputime;g=gasket(n,imsize,zeros(n,1));cputime-t startPoint= 866.000000000000 1000.00000000000 ans = 1.46 >> NOTE: The call to the Matlab function rand was converted as a callback to Matlab, so that the rand statement is actually performed in the Matlab environment and the results will be exactly the same as if from the Matlab prompt. 'plot' was also called as a callback. REMARK: matlab2fmex uses the temporary varaibles mxTR1, mxTR2, ... and mxTC1, mxTC2, ... (as in the example above) when a converted function returns a result whose size is unknown when called (such as the matlab callback rand, in gasket). This is to avoid memory leaks when using pointers as return objects in Fortran90. REMARK: One possible advantage when using matlab2fmex in that little or no effort may be put into vectorizing the *.m file because a good optimizing Fortran90 compiler will optimize these things for you. Another example is internew.m, which is too long to list here, yet shows some of the different capabilities of matlab2fmex. Notable in this example are: -- a single varaible on a line by itself (i on line 6) is displayed with its value by the mex-file as well. -- error messages are converted into fortran90 error message calls -- other converted function such as linspace, min, max. For internew.m we have: >> load internew;Z=ones(length(bf),length(bf));vs=zeros(pos^2,pos^2);as=zeros(1,pos+1);t=cputime; >> t=cputime;Z=internew(v,c,bf,bfdir,clim,E0,k,pos,omega,mu0,epsilon0,vs,as,Z);cputime-t i = 10 i = 20 i = 30 ans = 6.85 >> Now try matlab2fmex: >> matlab2fmex('internew'); But I get some compiler errors: Converting --- internew.m ==> internew.f ==> internew.mex matlab2fmex.m ==> mex internew.f mexfunctions.f mexoperators.f mexcallback.f f90: Error: internew.f, line 276: Each ac-value expression in an array-constructor must have the same type and type parameters. [PAD_ZP] pad_p=mxa((/pad_xp,pad_yp,pad_zp/)) --------------------------------^ f90: Error: internew.f, line 283: Each ac-value expression in an array-constructor must have the same type and type parameters. [PAD_ZPP] pad_pp=mxa((/pad_xpp,pad_ypp,pad_zpp/)) -----------------------------------^ f90: Error: internew.f, line 287: Each ac-value expression in an array-constructor must have the same type and type parameters. [SPACE_ZP] space_p=mxa((/space_xp,space_yp,space_zp/)) --------------------------------------^ . . . mex: compile of 'internew.f' failed. ??? Error using ==> mex Unable to complete successfully Error in ==> /home/barrowes/compiler/matlab2fmex.m On line 2099 ==> eval(['mex ',filenamef,' mexfunctions.f mexoperators.f mexcallback.f',libraries]); This is because the converter decided pad_zp, for example, was an integer (1x1 variables with integer values are assigned as integers), while it decided pad_xp and pad_yp were reals. Then when fortran attempts to put them in the same array, it has an error. This problem can be solved by not allowing integers in the converted routine by setting want_in to 0 instead of 1 (default). >> matlab2fmex('internew',[0 0 0 0 1 0 0 0 0]); ^ | here is want_in set to 0 Now we get no errors. Upon running the fortran90 mex file: >> load internew;Z=ones(length(bf),length(bf));vs=zeros(pos^2,pos^2);as=zeros(1,pos+1);t=cputime;Z=internew(v,c,bf,bfdir,clim,E0,k,pos,omega,mu0,epsilon0,vs,as,Z);cputime-t i 10.0000000000000 i 20.0000000000000 i 30.0000000000000 ans = 0.85 About a 7-10 times speedup. To illustrate converting multiple *.m files into the same mex file, consider internew_sep.m. internew_sep.m calls a separate m-file internew_pad.m: pad_xp=internew_pad(bf,clim,pos,1,i,i1); pad_yp=internew_pad(bf,clim,pos,2,i,i1); pad_zp=internew_pad(bf,clim,pos,3,i,i1); internew_pad.m is: function pad=internew_pad(bf,clim,pos,dim,i,j); if dim==1 pad=abs(clim(bf(i,j),2)-clim(bf(i,j),1))/pos/2; elseif dim==2 pad=abs(clim(bf(i,j),4)-clim(bf(i,j),3))/pos/2; elseif dim==3 pad=abs(clim(bf(i,j),6)-clim(bf(i,j),5))/pos/2; end To convert these, there must exist both internew_sep.mat and internew_pad.mat workspace files in the same directory. Running as native Matlab takes about 8 seconds. To convert these functions to 1 mex file internew_sep.mex with internew_pad included as a Fortran90 subroutine (again with no integers): matlab2fmex({'internew_sep' 'internew_pad'},[0 0 0 0 1 0 0 0 0 0]); Whereas before, the filename input was simply a string, now it is a cell array of strings. If we wanted different converter flags for each function we would specify matlab2fmex({'internew_sep' 'internew_pad'},{[0 0 0 0 1 0 0 0 0 0] [0 0 0 0 1 0 0 0 0 1]}); Now converting: >> matlab2fmex({'internew_sep' 'internew_pad'},[0 0 0 0 1 0 0 0 0]); Converting --- internew_sep.m ==> internew_sep.f ==> internew_sep.mex matlab2fmex.m Converting --- internew_pad.m ==> internew_pad.f ==> internew_pad.mex Making subroutine matlab2fmex. ==> mex internew_sep.f mexfunctions.f mexoperators.f mexcallback.f >> load internew_sep;Z=ones(length(bf),length(bf));vs=zeros(pos^2,pos^2);as=zeros(1,pos+1);t=cputime;Z=internew_sep(v,c,bf,bfdir,clim,E0,k,pos,omega,mu0,epsilon0,vs,as,Z);cputime-t i 10.0000000000000 i 20.0000000000000 i 30.0000000000000 ans = 0.71 >> A sppedup of about 11 times. The subroutine (a function in this case due to the single output argument) is found in the module mexfunctions in internew_sep.f90. One possible use for the converter would be simply to make Fortran90 subroutines from Matlab code. Note that all the dummy arguments in these procedures are assumed shape real or complex arrays. In an effort to make the process of saving the workspace a little easier, I have written a small program matlab2fmex_save.m which help in saving the workspace. Consider the function lqmns.m used in the vsph_pro.m program. On a normal run through lqmns.m some of the variables do not get instantiated and thus a simple "save lqmns" at the end of the program will cause an error when matlab2fmex tries to convert it (matlab2fmex needs every possibly declared variable in the program to be in the save file). Specifically, the following assignments should be made after a typical call to lqmns.m: qm1=1.1;qg0=qf0;qg1=qf1;qh0=qf0;qh1=qf1;qh2=qf2;l=2;q0l=1.1;q1l=1.1;qmk=1.1;km=1.1; Then the workspace should be saved. matlab2fmex_save works by inserting the line: %matlab2femx_save into lqmns.m, then matlab2fmex_save is called with a sample call to the program lqmns.m as a string argument. matlab2fmex_save will then execute lqmns.m and save the workspace for you, optionally calling the lqmns_save.m script (if it exists and just after the %matlab2femx_save insertion or at the end) just before saving. So if we make a script lqmns_save.m which contains: qm1=1.1;qg0=qf0;qg1=qf1;qh0=qf0;qh1=qf1;qh2=qf2;l=2;q0l=1.1;q1l=1.1;qmk=1.1;km=1.1; and add: %matlab2femx_save somewhere in lqmns.m (this line may be omitted if you would put it at the end of the file anyway), and we now run: matlab2fmex_save('lqmns(0,20,1.5,zeros(21,1))') then lqmns.mat will be saved and conversion can proceed. 8. REVISION HISTORY: See changelog