|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Full Text | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PAGE 1 GUERRILLACLUSTERSFORSCIENCE: THEAPPLICATIONOFGENETICALGORITHMSTOSPECTROSCOPY BY NOAHHENRYANDERSON AThesis SubmittedtotheDivisionofNaturalSciences NewCollegeofFlorida inpartialfulllmentoftherequirementsforthedegree BachelorofArts UnderthesponsorshipofProfessorStevenT.Shipman Sarasota,Florida May,2012 PAGE 2 Intalkingabouttheimpactofideasinoneeldonideasinanothereld,oneis alwaysapttomakeafoolofoneself.Inthesedaysofspecializationtherearetoo fewpeoplewhohavesuchadeepunderstandingoftwodepartmentsofourknowledge thattheydonotmakefoolsofthemselvesinoneortheother." |RichardFeynman,TheMeaningofItAll:ThoughtsofaCitizen-Scientist" ii PAGE 3 Acknowledgements Iwouldliketothankmyparentsfortheirenthusiasticsupport.Iwouldalso liketothankProfessorShipmanforhisroleasmyresearchandthesisadvisor{ providingvaluableadvice,reviewingmyweeklydrafts,andguidingmethroughthe entireprocess. FundingformyworkonthisprojectduringtheSummerof2010wasprovidedby theResearchCorporationforScienceAdvancementCottrellCollegeScienceAward. FundingforcloudcomputingwasprovidedbytheNewCollegeofFloridaStudent ResearchandTravelGrantprogram. iii PAGE 4 Contents Acknowledgementsiii Abstractx 1Introduction1 1.1Spectroscopy...............................1 1.2GeneticAlgorithms............................4 1.3VariableNames..............................7 2TheGeneticAlgorithm9 2.1TheGenome................................9 2.2TheFitnessFunction...........................11 2.3TheNextGeneration...........................13 2.4AdditionalStrategies...........................15 3DistributedComputing17 3.1AGuerrillaClusteratNewCollege...................18 3.2PlanningforDistributedComputing..................19 3.2.1DistributedComputingPlatforms................20 3.3ImplementationofDistributedComputing...............21 3.4ComputingintheCloud.........................25 iv PAGE 5 3.4.1IntroductiontoAmazonEC2..................25 3.4.2DeployingtheGeneticAlgorithmonAmazonEC2.......28 4Results31 4.1Background................................32 4.1.1SpectralComplexity.......................33 4.2Suprane..................................36 4.2.1SimulatedSpectraofSuprane..................36 4.2.2ObservedSpectrumofSuprane.................44 4.3Fluorobenzene...............................47 4.4StrawberryAldehyde...........................52 4.5Anisole/Hexanal/OrthouorotolueneMixture.............57 5Conclusions61 5.1FutureWork................................62 ADistributedComputingProtocols64 A.1PortingtheGeneticAlgorithmtoOtherDistributedComputingPlatforms....................................65 A.2SendingWorktotheGuerrillaCluster.................68 A.2.1CommunicationsProtocol....................68 A.2.2TheApp..............................72 BSourceCode75 Bibliography118 v PAGE 6 ListofFigures 1.1Anexampleofenergyleveltransitionsandacorrespondingspectrum2 1.2ExperimentallyobservedspectrumofSupraneat1K.........3 2.1Thecrossoveroperation.........................14 2.2Themutationoperation.........................14 2.3Anexampleofdoubleresonance.....................16 3.1Controlanddataowwithinthedistributor..............23 3.2DistributedComputingClient......................24 3.3AmazonEC2instancepriceperECU-hour...............26 3.4AmazonEC2SpotPricesforFebruary2012..............28 4.1ConformersofSuprane..........................34 4.2Experimentallyobservedspectrumofstrawberryaldehydeat298K.35 4.3Suprane..................................36 4.4SimulatedspectrumofSupraneat2K.................37 4.5SimulatedspectrumofSupraneat50K.................38 4.6SimulatedspectrumofSupraneat298K................39 4.7GeneticalgorithmtnessovertimeforsimulatedspectraofSuprane.40 4.8RotationalparametersovertimeforsimulatedspectraofSuprane..41 4.9ObservedspectrumofSupraneat1K..................45 vi PAGE 7 4.10GeneticalgorithmtnessovertimeforobservedspectrumofSuprane46 4.11RotationalparametersovertimeforobservedspectrumofSuprane..46 4.12Fluorobenzene...............................47 4.13Simulatedspectrumofuorobenzeneat298K.............49 4.14Geneticalgorithmtnessovertimeforsimulatedspectrumofuorobenzenecorrect A ..........................50 4.15Rotationalparametersovertimeforsimulatedspectrumofuorobenzenecorrect A .............................50 4.16Geneticalgorithmtnessovertimeforsimulatedspectrumofuorobenzenemodied A .........................51 4.17Rotationalparametersovertimeforsimulatedspectrumofuorobenzenemodied A ............................51 4.18Strawberryaldehyde...........................52 4.19Observedspectrumofstrawberryaldehydeat2K,conformercisI..54 4.20RotationalparametersovertimeforobservedspectraofcisIandcisII conformersofstrawberryaldehyde....................55 4.21RotationalparametersovertimeforobservedspectraoftransI,transII, andtransIIIconformersofstrawberryaldehyde............56 4.22Anisole,hexanal,andorthouorotoluene................57 4.23Observedspectrumofmixturecomponentorthouorotoluene.....59 4.24Rotationalparametersovertimeforsimulatedspectrumofthemixture60 A.1Anexampleofacongurationstanza..................65 A.2Anexamplelistofindividuals......................66 A.3Anexample ga-spectroscopy-client input/outputcombination..67 A.4Contentsof ga-spectroscopy.app ...................73 vii PAGE 8 A.5Templatefor app.pl ...........................74 viii PAGE 9 ListofTables 1.1Tableofvariables.............................8 3.1Distributionofcomputersamongclassrooms..............18 3.2AmazonEC2instancetypes.......................26 4.1ResultsforsimulatedspectraofSuprane................42 4.2ResultsforsimulatedspectraofSupraneat298Kundervariousgenetic algorithmcongurations.........................43 4.3ResultsforobservedspectrumofSupraneat1K............44 4.4Resultsforsimulatedspectrumofuorobenzeneat298K.......48 4.5Dipolemomentsforstrawberryaldehyde................52 4.6Resultsforsimulatedspectrumofstrawberryaldehyde........53 4.7Dipolemomentsformixturecomponents................58 4.8Resultsforsimulatedspectrumofmixture...............58 ix PAGE 10 GUERRILLACLUSTERSFORSCIENCE: THEAPPLICATIONOFGENETICALGORITHMSTOSPECTROSCOPY NoahHenryAnderson NewCollegeofFlorida,2012 ABSTRACT Fittingmolecularparameterstomicrowavespectraisdicult,especiallyforroom temperaturespectra,whichareverydense.Thecurrentmethodistomanuallymatch peaksandtweakparameters,anextremelytimeconsumingprocess.Becausetheforwardproblemofpredictingaspectrumgivenmolecularparametersismucheasier,a geneticalgorithmispotentiallywell-suitedtoautomatedspectraltting.Agenetic algorithmwasdevelopedtotspectrabyoptimizingtherotationalanddistortion parametersofthemoleculeforthebestmatchbetweenthepredictedandobserved spectra.Thealgorithmwastestedonavarietyofsimulatedandexperimentally observedspectrawithsomesuccess.Theuseofparallel,distributed,andcloudcomputingtorunthegeneticalgorithmfasteralsoposedsomeinterestingchallenges. ProfessorStevenT.Shipman DivisionofNaturalSciences x PAGE 11 Chapter1 Introduction 1.1Spectroscopy Spectroscopyistheuseofelectromagneticradiationtostudythestructureof molecules.Dierentfrequenciescanbeusedtoinduceorinspectdierentkindsof molecularmotion.Forexample,ultravioletspectroscopyanalyzeselectronicmotion, andtheinfraredrangeisusedtoexaminethevibrationalmodesofchemicalbonds. Forthisproject,weareinterestedinmicrowavespectroscopy,withfrequenciesrangingbetween0.3and300GHz.Amolecule'sabsorptionoremissionofmicrowave frequenciesisrelatedtoitsrotationalbehavior. Spectroscopyhasmanyapplications.Forexample,analyticalchemistscanuse spectroscopyforidenticationthroughspectralngerprinting,andphysicalchemists usespectroscopytoinvestigatethepropertiesandcompositionofmaterials.Inthis lattercase,theinformationaboutthemoleculecomesintheformofvariousparameters,suchasmomentsofinertiainthecaseofmicrowavespectroscopy.However, itisnotastraightforwardtasktodeterminethesemolecularparametersfromits spectrum. 1 PAGE 12 Inquantummechanics,particlescanhaveaquantized"energyspectrum.In otherwords,theenergyoftheparticlemayonlytakeondiscretevalues,calledenergy levels.Eachtransitionbetweenenergylevelscorrespondstoapeakinthefrequency spectrumandvice-versa.Thefrequencyofthepeakisgivenbythedierencein energylevels.ThisrelationshipisshowninFigure1.1. 123546 Energy d c b a 5 12 4 6 3 Frequency Intensity Figure1.1:Anexampleofenergyleveltransitionsandacorrespondingspectrum. Peakintensitiesareillustrativeonly. Forexample,let x bethedierencebetweenenergylevels c and d .Itisclear thattransitions2and3dierby x ,asdotransitions4and5.Now,considerthe correspondingpeaksinthespectrum.Thefrequenciesofpeaks2and3dierbythe sameamountasthefrequenciesofpeaks4and5. Theenergylevelsarereferredtousingquantumnumbersintheform J K A K C where J isthelengthofthequantizedangularmomentumvector,and K A and K C arelikeprojectionsof J ontotwodimensions.Thus,anytransitioncanbedescribed bythequantumnumbersoftheinitialandnalenergylevels. AnactualspectrumisshowninFigure1.2.ThespectrumisofSuprane,amolecule thatisdiscussedinSection4.2. Thereexistsoftwareprograms,suchasSPCAT[1],thatcanbeusedtoproducea 2 PAGE 13 0 0.0005 0.001 0.0015 0.002 0.0025 0.003 9000100001100012000130001400015000160001700018000 Intensityarbitraryunits FrequencyMHz Figure1.2:ExperimentallyobservedspectrumofSupraneat1K simulatedspectrumgivenvaluesofthevariousparametersofthemolecule.SPCAT operatesbycomputinganumericsolutiontothedierentialequationsthatdescribe therotationalbehaviorofthemolecule.Theeigenvaluesofthematrixformofthis systemofequationsprovideinformationontheenergyleveltransitions.Thespectrum isthenproducedbytakingthedierenceoftheseeigenvalues.Thisiswhytheinverse problem{ndingtheparametersgiventhespectrum{issodicult:itishardto ndasetofeigenvaluesthatsatisfythegivensetofdierences. Currentpracticeistotspectrabyhand,amanualprocessofpeakmatchingand parametertweakingthatisiterateduntilasucientlygoodthasbeenobtained. Obviously,thisprocessisextremelytimeconsuming.Automatedttinghasthe potentialtoallowscientiststopursueresearchactivitiesthataremoreinteresting thanparametertweaking. Mostmicrowavespectroscopyoccursattemperaturesnearabsolutezero,although forvariousreasonsafewresearchers,includingProfessorShipman,operatenearroom 3 PAGE 14 temperature 298Kelvin.However,thenumberofpeaksinaspectrumscaleswith temperature,approximatelyby T 3 2 { T 5 2 .Roomtemperaturespectrathereforehave averylargenumberofpeaks,whichmakesthettingprocessfarmorecomplicated thanforlowtemperaturespectra. Apromisingtechniqueforautomatedttingistouseageneticalgorithmtodetermineparametersthatproducethespectrummostsimilartotheobservedspectrum wearetryingtot.SimilaritycanbeevaluatedbyusingSPCATtopredictaspectrumforthegivenparametersandthencomparingtheresultingspectra.Thegenetic algorithmapproachisthesubjectofthisthesis. Thisisnotanewidea.W.LeoMeertsandMichaelSchmittarealsoworkingon ttingmicrowavespectrausinggeneticalgorithms[2][3].However,theirworkoperates onmuchlowerresolutionspectraMHzpeakwidththanthedataconsideredhere whichhasapeakwidthofupto500kHz. AnotherapproachtoautomatedttingwasdevelopedbyIanFinneranforhis thesisresearchatNewCollege[4]incooperationwiththeUniversityofVirginia.This ismoreofabrute-force"approach,andoperatesverydierentlyfromagenetic algorithm,tendingtotthespectrausingasimilarapproachasahuman.Itoften producesusefulresults,butrequiresamuchbetterinitialguessthanthegenetic algorithm. 1.2GeneticAlgorithms Ageneticalgorithm[5]isanevolutionaryoptimizationalgorithm,whichseeksto determinethebestsolutiontosomeproblembyrandomlygeneratingalargenumber ofpotentialsolutionsandtestingtheirtness.Itdoesthisbymaintainingapopulationofindividuals,witheachindividualconsistingofagenomethatrepresentsa 4 PAGE 15 possiblesolutiontotheproblem.Thesizeofthepopulationremainsxedforthe durationofthealgorithm. Oneverygeneration,eachindividualisevaluatedbysomeproblem-specictness function.Thetnessfunctionwilldecodethepotentialsolutionfromthegenome, useitastheinputtosomecomputation,andthenproducearealnumberwhich representshowclosethepotentialsolutionistothecorrectanswer.Afterthetness hasbeencomputedforalloftheindividualsinthepopulation,anewgenerationof individualsiscreatedfromthepreviouspopulationbyapplyingDarwinianstrategies. Thistakestheformofgeneticoperators,whichareusedtochooseexistingindividuals, combinethemtoproducenewindividuals,andmakerandommutationstoformthe nalmembersofthenewpopulation. Onepopulartechniqueissingle-pointcrossover,whereasplitpointischosenon thegenomesofapairofindividuals,andthedataononepartofthegenomeis swappedbetweenindividuals.Therearemanyvariationsonthecrossoveroperator, includingtwo-pointcrossovertheportionbetweenthesplitpointsisswapped,uniformcrossovermultiplecrossoverpoints,chosensothatthenumberofbitsfrom eachparenthasaconstantratio,andmulti-parentcrossoverthevalueofagiven bitinthechildisthemajorityvalueofthebitinallparents.Inallcases,the crossoveroperationaimstocombinefeaturesofexistingindividualsthathave,on average,relativelyhightness. Randommutationsarealsoappliedwithlowprobability,whereoneormorebits ofthegenomearerandomlychangedtoanothervalue.Mutationallowsthegenetic algorithmsomeexibilitytowandertheentirespaceofpossiblesolutions.Most importantly,italsohelpskeepthegeneticalgorithmfromgettingtrappedinalocal maximum. Afterproducingthenewpopulation,anewgenerationbeginswiththeevaluation 5 PAGE 16 ofthetnessoftheindividuals.Theprocesscontinuesforthedesirednumberof generations;manygenerationsmayberequiredtodetermineasatisfactorysolution. Geneticalgorithmshavebeenemployedtosolvemanyproblemsthroughoutthe realmofscienceandtechnology.Onemajorapplicationisthetravelingsalesman problem,wheregivenasetofcities,thegoalistondtheshortestroutethatvisitsall ofthemexactlyonceeach.Geneticalgorithmscanbeusedtoapproximatetheoptimal solutiontowithin0.5%inareasonableamountoftime[6].Thetravelingsalesman problemitselfhasimportantapplications,evenbeyondobviouslogisticproblems,such astheoptimizationofthedesignandmanufactureofprintedcircuitboards[7]and microchips.Anotherapplicationisgeneticprogramming[8],whereageneticalgorithm evolvesacomputerprogramoralgorithmtosolvesomeproblem.Currently,genetic programmingremainsanactiveresearcharea. Therearetwokeyquestionstoconsiderwhendevelopingageneticalgorithm tosolveaproblem.First,thereisthematterofencodingthesolutionspaceinthe genome.Whilemanyproblemscanbeconciselyrepresentedasahandfulofintegersor oating-pointnumbers,othersrequiremorecomplicatedstructures.Forexample,the travelingsalesmanproblemrequiresthegenometocontainanorderingofthecities, andgeneticprogrammingdictatesthatthegenomeencodeasequenceofinstructions, whichmaybeofvariablelength.Someproblemsmayalsobenetfromspecialized geneticoperators;forexample,aswap-cityoperationmaybeusefulforagenetic algorithmsolvingthetravelingsalesmanproblem. Theotherfundamentalquestionishowtoevaluatethetnessoftheindividual. Somemetricmustbedevelopedtodeterminehowclosetheproposedsolutionistothe unknownexactsolution.Notably,thismustbeacontinuousmeasure;asimpleyesor-noanswerisnotveryhelpful,andwillreducethegeneticalgorithmtoarandom search.Inaddition,becausethetnessiscomputedmillionsoftimesforeachrunof 6 PAGE 17 thegeneticalgorithm,itisvitalfortheevaluationtobequick. 1.3VariableNames ThevariablenamesusedinthisthesisaretabulatedinTable1.1.Geneticalgorithmcommand-lineparametersarenoted,whereapplicable. 7 PAGE 18 VariableMeaning A B C D J D JK D K d J d K SPCATparameterunderoptimizationMinimumandmaximumvalue ofeachparameteriscontrolledvia --amin --amax ,..., --djmin ,..., --deljmin ,... B 0 Numberofbins --bins E Elitism --elitism e 0 Decayrateforuncertaintypropagation,whichisenabledifnonzero --errordecay f i Fitnessofthe i thindividualofthepopulation F Scalingfactortousefordynamicmutationcomputations --dynamic-mutation-factor F min F max Minimumandmaximumfrequencyunderconsideration --rangemin --rangemax g Currentgenerationnumber g 2 Z + [f 0 g i;j;k;l;m;n Scratchvariables J K A K C QuantumnumbersSection1.1 J I i;j J F i;j The J quantumnumberfortheinitial J I ornal J F energylevelof the j thpeakinthe i thbin M Mutationprobability --mutationrate M min M range Mutationraterangefordynamicmutation --dynamic-mutation-min dynamic-mutation-range M 2 [ M min ;M min + M range ] [0 ; 1]. P Populationsize --population T Temperature,inKelvin W Weightofintensityvs.peakcountintnessfunction --weight W 0 Mutationweight,toadjustmutationprobabilitywithrespecttobit signicance --mutationweight W Numberofgenerationsofhistorytousefordynamicmutation --dynamic-mutation-width x y z Scratchvariables g Dynamicmutationvaluestoredforgeneration g Table1.1:Tableofvariables 8 PAGE 19 Chapter2 TheGeneticAlgorithm Themainfocusofthisresearchistoproduceageneticalgorithmthatcompares predictedspectrawithanobservedspectrum,optimizingtheparametersofthepredictiontobestmatchtheobservation. ThegeneticalgorithmusesSPCAT[1]topredictaspectrumgivenasetofparameters.Whenprovidedwithapairofinputlesdescribingtheparameters,SPCAT willproduceanoutputlecontainingthelistofpredictedpeaks.ThegeneticalgorithminterfaceswithSPCATbycreatingtheinputlesinatemporarydirectory, runningSPCAT,andparsingtheresultingoutputle.Theinputlesaregenerated byinsertingthecurrentvalueoftheparametersintoauser-providedtemplate. 2.1TheGenome WevaryeightoftheprimaryparametersoftheSPCATprogram.Therstthree aretherotationalparameters A B ,and C ,whichrepresentthemolecule'smoments ofinertiaandsatisfy A B C> 0.Thenextvearethedistortionparameters D J D JK D K d J ,and d K ,whichrepresentthecentrifugaldistortionofthemolecule.We 9 PAGE 20 thereforechooseagenomethatisdividedintoeightsegments,oneforeachparameter. Theencodingoftheseparametersposessomediculty,aseachoneisareal number.Inordertoenablesensiblebehaviorformutationandcrossover,thesegments areencodedasanunsigned32-bitinteger.Theparametersarerepresentedasxed pointnumbers,withtherotationalparametersstoredas x 10 5 andthedistortion parameterswhichmaybenegativerepresentedas x 10 12 +2 31 )]TJ/F15 11.9552 Tf 10.692 0 Td [(1.Theosetof 2 31 )]TJ/F15 11.9552 Tf 10.475 0 Td [(1isusedbecauseitisthemidpointof 32 ; 0],therangeofintegersrepresentable in32bits. Finally,consideracasewhere2 k isabettersolutionthan2 k )]TJ/F15 11.9552 Tf 11.594 0 Td [(1.Asanexample, let k =3,so2 k =8=1000 2 and2 k )]TJ/F15 11.9552 Tf 12.77 0 Td [(1=7=0111 2 .Inordertoimprovethe proposedsolutionbymutatingfrom2 k )]TJ/F15 11.9552 Tf 12.707 0 Td [(1to2 k ,itisnecessaryfor k +1bitsto changesimultaneously.Becauseoftheextremeunlikelihoodofsuchamutationthe probabilityis M k +1 ,thegeneticalgorithmwillbecomestuckataHammingwall." Inordertoeliminatethisartifactfromthemutationalgorithm,therepresentationof parametervaluesinthegenomeisencodedusingGraycode[9].Graycodeisaway toencodeintegersintobinarysuchthatconsecutiveintegersdierinonlyonebit.It isdenedas Encode: b 7! b b 1 Decode: g 7! g g 1 g 2 g 4 g n= 2 where istheexclusiveoroperator, isthebitwiserightshiftoperator,and n is thenumberofbits.Forexample,8=1100 G and7=0100 G .Theseoptimizations allowthegeneticalgorithmtoeasilymovebysmallamountsbyeliminatingarticial boundariestomutationneararbitraryvalues. 10 PAGE 21 2.2TheFitnessFunction Toevaluatethetnessofanindividual,thegeneticalgorithmrstusesSPCAT tocomputethepredictedspectrumfromtheparametersencodedinthegenome.The tnessfunctionthencomputesthedierencebetweenthepredictedspectrumandthe observedspectrum.Thisdierenceisusedasthetnessvalue.Peakintensitiesare onanarbitraryscale,andabsolueintensityishighlyvariable,duetoanumberof factors,includingtemperatureandexperimentalinuencessuchascontaminantsand samplemass.Whenloadingthespectra,bothpredictedandobserved,thegenetic algorithmscalestheintensitiestotherange[0 ; 1],preservingrelativevalues. Thepeaksintheobservedandpredictedspectraarethendividedinto B 0 equal binsoverthefrequencyinterval[ F min ;F max ].Thegeneticalgorithmalsosupportsa modewherethenumberofbinsisrandomizedoneachgeneration,inordertoavoid situationswherethechoiceofbinsizemayadverselyaectconvergence.Thiscould beanissueincaseswhereaprominentpeakislocatednearaboundarybetweentwo bins,andthematchingpeaksinthegeneticalgorithm'sproposedsolutionsareonthe wrongsideoftheboundaryfrequency. Ifuncertaintypropagationisenabled,uncertaintyvaluesforthemolecularparameterswillbeprovidedtoSPCAT,andwillbepropagatedtofrequencyuncertainties foreachpeakinthepredictedspectrum.Eachgeneticalgorithmparametercanbe restrictedtoacertainrangebasedonaninitialguessofknownuncertainty.The initialuncertaintyprovidedtoSPCATbythegeneticalgorithmforeachparameter isone-quarteroftherangeforthatparameter,anditdegradesbyafactorof e 0 for eachgeneration.Thegeneticalgorithmthencanthenndthe25peakswithleast uncertaintyand,if X isthesetofuncertaintiesforthesepeaks,computeanerror 11 PAGE 22 tolerance: e max =max 10 F max )]TJ/F18 11.9552 Tf 11.955 0 Td [(F min B 0 [ X Thegeneticalgorithmwillthenignorepredictedpeakswithanuncertaintygreater than e max whileitisdividingpeaksintobins.Thegeneticalgorithmwilltrackthe totaluncertaintyofthepeaksineachbin. Thetotalintensityandnumberofpeaksaretrackedforeachbin,alongwitha weightingfactorbasedonthequantumnumbersofthepeaks: n i X j =1 s 2 J I i;j + J F i;j Where J I i;j and J F i;j representthe J quantumnumberoftheinitialandnal energylevels,respectively,ofthe j thpeakinthe i thbin.Thisfactorincreasesthe weightonthetnessofpeakswithlow J values.Thesepeaksaremostimportantfor ttingthespectrum. Thenaltnessoftheindividualisthencomputedaccordingtothisequation: F = )]TJ/F15 11.9552 Tf 9.298 0 Td [(1000 B 0 X i =1 )]TJ/F18 11.9552 Tf 5.48 -9.683 Td [(W j x i )]TJ/F18 11.9552 Tf 11.955 0 Td [(y i j 2 + )]TJ/F18 11.9552 Tf 11.955 0 Td [(W j m i )]TJ/F18 11.9552 Tf 11.955 0 Td [(n i j 2 e i n i X j =1 s 2 J I i;j + J F i;j !! Where x i and y i arethetotalintensityofthepeaksinthe i thbinoftheobserved andthepredictedspectra,respectively,and m i and n i arethenumberofpeaksinthe same. e i isthetotaluncertaintyofthefrequencyofthepredictedpeaksinthe i th bin,ifuncertaintypropagationisenabled. Afterdeterminingthetnessofeachindividual,theyaresortedbytnessto determinethetopcandidates. 12 PAGE 23 2.3TheNextGeneration Thenewgenerationiscreatedthroughthecombinationofavarietyofmethods: elitism,crossover,andmutation. First,the E individualswithhighesttnessofthepreviousgenerationcarryover tothenewgenerationwithoutmodication.Thiselitismpreservesthemostpromising candidatesolutionsforfuturegenerationsandguardsagainstthepossibilityofthe crossoverandmutationoperationsproducingworseindividuals.Thedefaultvalue for E is p P ,butcanbeadjustedforeachrun. Topopulatetheremainderofthegeneration,westartbybyselectingpairsof individualsusingroulette-wheelselection.Intheroulette-wheelselectionalgorithm, theprobabilityofagivenindividualbeingchosenisdirectlyproportionaltoitstness. Thealgorithmchoosesanindividualbygeneratingarandomnumber x 2 0 ; P X i =0 f 0 i andthendeterminingtheleast k suchthat x< k X i =0 f 0 i where f 0 i isthetnessofthe i thindividual,scaledtotherange[0 ; 1].Inthiswaythe algorithmchoosesthe k thindividualofthepopulation. Foreachpairofchosenindividuals,onecrossoverpointisrandomlychosenwith uniformdistributionfromwithineachsegment,whereeachsegmentrepresentsasingle SPCATparameter.Therstorsecondportionofeachsegmentisthenswapped betweenthetwoindividuals,againwithbothchoiceshavingequalprobability.The 13 PAGE 24 crossoveroperationisillustratedinFigure2.1. aBefore = bAfter Figure2.1:Thesingle-pointcrossoveroperation.Inthisexample,thesecondportion ofthesegmentisbeingswapped. Finally,anysubsetofthebitsinthegenomeoftheindividualsmaybetoggled withsomeprobability.Themutationprobabilityofthe i thmostsignicantbitofa 32-bitsegmentis p i = M i 32 W 0 .Thisprovidestheoptionforthegeneticalgorithm topreferentiallymutatethelesssignicantbitsofthesegment.Theeectofthe mutationoperatorisshowninFigure2.2. 01110010 aBefore = 00111110 bAfter Figure2.2:Themutationoperation.Inthisexample,twobitsarebeingtoggled. Inaddition,thegeneticalgorithmhasadynamicmutationfeature,whichvaries themutationrate M overthedurationoftherun.Inthismode,themutationrate riseswhenevertheaveragetnessofthepopulationisstagnating.Thefeaturecan betunedviaseveralparameterstothegeneticalgorithm: F aweightingfactor, W thenumberofgenerationsofhistory, M min theminimummutationrate,and M range therange.Thisallowsthegeneticalgorithmtotrytomakelargerchanges ifsmallchangesarenothavinganyeect.Thedynamicmutationfeatureoperates 14 PAGE 25 bycomputingthefollowingvalueafterthe i thgeneration: Dyn g = Dyn g )]TJ/F16 7.9701 Tf 6.586 0 Td [(1 F )]TJ/F15 11.9552 Tf 11.955 0 Td [(1 F + P X i =0 f i P Let k = g )]TJ/F18 11.9552 Tf 11.955 0 Td [(W .Themutationrateisthensetasfollows: M = M min + e )]TJ/F23 5.9776 Tf 7.782 4.689 Td [(j g )]TJ/F20 5.9776 Tf 5.757 0 Td [( k j F M range Adefaultmutationrateisusedif g PAGE 26 transition,whichsharesthe12 2 ; 10 energylevelwiththepulsedpeak. v =0 v =1 v =2 DRODRO DRO MWSignalarb.units 13700 13750 13800 13850 100 50 0 -50 -100 100 50 0 -50 -100 100 50 0 -50 -100 FrequencyMHzFrequencyMHzFrequencyMHz 13700137501380013850 13700137501380013850 Pulse:15735.00MHz Modulate:13720.66MHz DROnDROn DROn Pulse:15806.50MHz Modulate:13773.44MHz Pulse:15875.03MHz Modulate:13826.86MHz Figure2.3:Anexampleofdoubleresonance.Adaptedfrom[10]. Knowndoubleresonancesmaybeprovidedtothegeneticalgorithmasalistof pulsedpeaksfrequencies,eachwithalistoflinkedpeaks.Ifdoubleresonancesare provided,thetnessfunctionwillverifythemafterloadingthepredictedspectrum. Foreachdoubleresonance,thetnessfunctionwillcheckthatsometripletofquantum numbersappearsonsomepeakneareveryexpectedpeakfrequency.Iftheindividual doesnotsatisfyalloftheexpecteddoubleresonances,theindividualwillbediscarded andreplaced. Thedoubleresonancefeatureisnotusedatpresent,asitmakesthegeneticalgorithmtakefarlongertorunanddoesnotprovidemuchimprovementinconvergence. 16 PAGE 27 Chapter3 DistributedComputing Becauseitmayhavetoevaluatealargenumberofindividuals,thegeneticalgorithmcanrequireavastamountofcomputation.However,becauseeachindividual isevaluatedindependentlyofanyother,multipleindividualscanbeevaluatedin parallel.Thisallowsthegeneticalgorithmtotakeadvantageofmodernmulticore processorstoachievesubstantialtimesavings.Furthermore,becauseofthelimited amountofdatarequiredtoevaluateanindividual{justahandfulofnumbers{the geneticalgorithmcanbeeasilyparallelizedacrossmultiplecomputers. Theuseofdistributedcomputinghasallowedthegeneticalgorithmtorunwith muchlargerpopulationsthanwouldbefeasibleonasinglecomputer.Experimentationwithpopulationsizeshasdemonstratedthat16,384individualsisagoodchoice forthisproblem.However,apopulationofthissizewouldtakeaverylongtimeto compute,evenonadual-corecomputer.Largercomputationsareperformedona computerclusteratNewCollege,whichwasexpandedtoincludecloudcomputers fromAmazonWebServices. 17 PAGE 28 3.1AGuerrillaClusteratNewCollege NewCollegedoesnotcurrentlyhaveanyfacilitiesforhigh-performancecomputing.However,therearemanycomputersontheNewCollegecampus,manyofwhich areidleformostoftheday.Itthereforeseemedlogicaltoharvestthisunusedcapacity forrunningthegeneticalgorithmbyunitingthecomputersintoaguerrilla"cluster. Thecomputersinseveralclassroomsandcomputerlabswereselectedtobepart ofthecluster,andarelistedinTable3.1.ThecomputersintheDivisionofNatural SciencescomputerlabHNS108werenaturalchoicesforthecluster,alongwiththe computerintheMathReadingRoomHNS106.Inaddition,ProfessorShipman volunteeredthecomputersinhislabHNS211.Thecomputersinlibraryclassrooms LBR152{156and248{252werealsoselected,astheywereespeciallyfast.However, theprimaryfunctionofthesecomputerswasforpresentationsandotherclassroom activities.Toavoidinterferingwithanyteaching,IsubstitutedtheevenbettercomputersoftheACE329computerlabuponcompletionofthenewAcademicCenter. Finally,theunderutilizedLBR209computerlabwasoccasionallyusedtoprovide additionalcapacitytothecluster. RoomComputersCurrentUsage ACE32924Yes;Upto24inregularuse HNS1061Occasionaluse HNS10815Yes;Upto12inregularuse HNS2116Yes;All6inregularuse LBR152{156,248{2526SupplantedbyACE329 LBR20918Occasionaluse Total7042inregularuse Table3.1:Distributionofcomputersamongclassrooms 18 PAGE 29 3.2PlanningforDistributedComputing TheGuerrillaClusterplanposedseveralunusualrequirementsandorganizational challenges.Oneofthemainchallengeswasthatofsharingthecomputerswithlocal users.Ontheonehand,thedistributedcomputingclientmustavoidinterferingwith theintendedpurposeofthecomputer,includingclasspresentationsandscientic computing.Thisiseasilysolvedbyrunningthegeneticalgorithminsuchawaythat theoperatingsystemconsidersittobealowprioritytaskandallowstheuser'swork totakeprecedence.Ontheotherhand,experiencehasshownthatwhenpresented withawindowthatsayspleasedonotclosethiswindow,"thereisanontrivial probabilitythattheuserwillreactbyclosingthewindowimmediately. 1 Sharing goesbothways,sotheclientshouldbeabletocontinuerunningasaniconinthe systemnoticationareaevenafterthestatuswindowhasbeenclosed,adoptingthe samebehaviorasmostinstantmessengerprograms.Ofcourse,itisstillpossibleto quitthesoftwarebychoosingExitfromtheFilemenu. Thisisnottheonlydicultyplaguingthecluster.Shouldacomputerbeshut downorrestarted,thedistributedcomputingclientwillstoprunning.Theseactions areregularlyinvokedbythelocaluser,operatingsystemservices,orpoweroutages. Becausethesedisturbancesareactuallyquitefrequent,itisimportantthatthedistributedcomputingplatformtolerateadynamiccluster,andpermitworkerstoleave orjoinatanytime. Inaddition,theinitialdeploymentoftheclusterplannedforthedistributedcomputingservertobelocatedoutsideofNewCollege.Assuch,theplatformneededto tolerateanetworktopologythatplacestheworkersbehindarewallthatprevents directaccesstotheworkersfromtheserver.Thisconsiderationwaseventuallyobso1 Nostudyhasbeenmadeontheimpactofahypnoticprogressbaronthebehavioroflocalusers. 19 PAGE 30 leted,however,astheserverwaseventuallymovedinsidetherewall.Unfortunately, whentheclusterwasexpandedtoincludecloudcomputerssection3.4,theserver's newlocationrequiredasolutiontotheoppositeproblem{arewallthatprevents theclientsfromdirectlyreachingtheserver. Finally,thenatureofthegeneticalgorithmimposessomeadditionalconstraints. Mostimportantly,thegeneration-basedstructureoftheproblemrequiresthateach generationbecompletedpriortostartingthenext.Inaddition,theworkrequiredfor thegeneticalgorithmistinycomparedtomanycomputingprojectsthatuseclusters{ workunitsaremeasuredinseconds,ratherthanminutes,hours,ordays.Thismeans thatthedistributedcomputingplatformmustperformwellwithsmalljobsthatneed tobecomputedquickly. 3.2.1DistributedComputingPlatforms Theuseofidlecomputerstoworkonlargecomputationalprojectsisnotuncommon.Forexample,SETI@home[11]andFolding@home[12]arepopularprojects poweredbyaworldwidenetworkofvolunteers.Similarprojectsoperateatsmaller campus-levelscales,suchastheCERNCompactPhysicsScreensaver[13].Thisidletimeusagemodelsharedbyvolunteer-baseddistributedcomputingandtheGuerrillaClustermeansthatthecomputersarenotdedicatedtotheprojectandmay stopworkingonitatanytime.However,smallworkunitsareveryuncommonin volunteer-baseddistributedcomputingprojects,withmostsuchprojectsoperating onworkunitsthatrequireatleastanhour,andsometimesuptoamonth,tonish. Forvolunteer-baseddistributedcomputing,someprojectshavedevelopedcustomizedplatforms,includingFolding@homeandtheGreatInternetMersennePrime Search[14].Mostprojects,however,useBOINCBerkeleyOpenInfrastructurefor NetworkComputing[15].Unfortunately,forthepurposesofthegeneticalgorithm, 20 PAGE 31 BOINCwasfoundtobeapoorchoice,asitwasclearlydesignedformuchlarger workunits.Mostnotably,ininitialtests,everyworkunitreturnedtotheserverwas subjectedtoanunexplained30-seconddelaybeforeitwasmarkedascompleted.The BOINCserveralsohadotherineciencies,andingeneralwasmorecomplexand muchlargerthanwasdesired.Arelatedproject,calledRT-BOINC[16],hadmodied theBOINCservertobettersupportreal-timecomputations.Thisseemedpromising, butwasonlyaprototype,andwouldhaverequiredsubstantiallymoretimethanthe originalBOINCimplementationtosetupandmaintain. Severalotherplatformswereconsidered,however,noneofthesesupportedall oftheconstraintsimposedbytheGuerrillaCluster.Specically,MPIseemedto havepoorsupportfordynamicclustercompositionandheterogeneousworkers,and Condorappearedpoorlysuitedtosmalljobs.Asaresult,Idecidedtodevelopmy ownclusterplatform. 3.3ImplementationofDistributedComputing Modicationofthegeneticalgorithmtosupportdistributedcomputingwasreasonablystraightforward.Thegeneticalgorithm'smulti-threadsupportisbasedona producer-consumermodel,wherethemainthreadtheproducerpushesalloftheindividualsintoaqueueandtheworkerthreadsconsumerspullitemsfromthequeue andprocessthem.Theworkerthreadswerereplacedwithasingleconsumerthat simplypullsalloftheindividualsandconvertsthemtoatextualrepresentationthat iseasytotransportoverthenetwork.Smallchunksofthedatacanthenbecombined withthecongurationofthegeneticalgorithmtoformaworkunitthatcanberun onanycomputer. Thegeneticalgorithmwasthenmodiedsothatitcouldbecompiledinaclient" 21 PAGE 32 mode.Inthismode,itsimplyreadsaninputlecontainingcongurationinformation andasetofindividualsandthenevaluatesthetnessofeachone,storingtheresult inanoutputle.Thegeneticalgorithmclientdoesnotincludeanypartofanactual geneticalgorithmexceptforthetnessfunction.Thismakesiteasiertoporttoother operatingsystems,includingWindows,whichisusedonmostofthecomputersin theGuerrillaCluster. Thedistributedcomputingplatformwasdesignedtohavethreeparts:thedistributor,theserver,andtheclient. Thedistributorisresponsibleforreceivingindividualsfromthegeneticalgorithm, assemblingthemintoworkunits,andsendingthemtotheserver.Whenaworkunit completessuccessfully,thedistributorforwardstheresultsbacktothegeneticalgorithm.Ifaworkunitisnotsuccessful,thedistributormustretrytheindividualsin anotherworkunit.ThisoperationofthedistributorisdiagrammedinFigure3.1.The distributoristhebridgebetweenthegeneticalgorithmandthedistributedcomputing server,andwasdesignedasaseparatecomponentfromthegeneticalgorithminorder toallowittobeimplementedinPerl,ratherthanC. 2 Thedistributedcomputing platformcanbeusedforsolvingotherproblemsbyimplementinganappropriatedistributor.Implementersareencouragedtointegratethedistributorwiththeirproblem implementation. Theservermanagesworkersandworkunits,andbridgesbetweenthedistributor andclient.Foreachworkunit,thedistributorwillasktheserverforaworkerand thenprovidetheserverwithaworkunitpackagedforthatworkerforexample,the sizeoftheworkunitmayvarytoaccountforthespeedoftheworker.Theserver thenforwardstheworkunittotheclient,andcontinuestotrackitsprogress.When theclientnishestheworkunit,oriftheclienttakestoolongorstopsworking,the 2 Cisanelanguage,buttextprocessingisnotoneofitsfortes. 22 PAGE 33 GeneticAlgorithm Receiveindividuals Store individuals Requestworker Chooseindividuals andbuildworkunit Sendworkunit toserver ForwardresultstoGA GAhassent individuals Serverhas sentresults Yes No Workunit successful? Delete individuals Retry Markas unsent Anyunsent individuals No Yes Selectindividuals andmarkassent Receiveresultsfromserver Yes Individuals Yes Yes No No No Allindividuals evaluated AllowGAtocontinue Server Dataaccess Interprocessmessage Controlow Figure3.1:Controlanddataowwithinthedistributor 23 PAGE 34 serverwillnotifythedistributorandforwardanyresultdata.Theserveralsoprovides supportformonitoringthestatusofthecluster. Figure3.2:DistributedComputingClient Theclientisinstalledoneach worker.Itcanberunwitheither auser-resistantgraphicalinterface Figure3.2oracommand-lineinterface.Theclientsimplydownloadsworkunitsfromtheserver,executesthem,andreturnstheresult. Italsoreportsstatusinformationto theserver,andtothelocaluser. Thereisalsoaproblem-specicportionoftheclient,calledtheapp,"whichisdownloadedaspartoftheworkunitand containsthesoftwareforsolvingthespecicproblem. Forclustersconsistingofadiversemixofoperatingsystemsandprocessorarchitectures,portabilitybecomesaconcern.TheGuerrillaClustermostlyconsistsof LinuxandWindowscomputersonx86andx86 64,the32-and64-bitversionsofthe standardPCarchitecture,respectively.Iftheappincludescompiledsoftware,itmust beprovidedforeachoperatingsystemandarchitecturecombinationpresentinthe cluster,althoughtherearesomeexceptions.Forexample,x86 64operatingsystems haveacompatibilitylayerthatallowsthemtorunx86applications,andFreeBSD hasacompatibilitylayerforrunningLinuxapplications.Theappforthegenetic algorithmincludesthegeneticalgorithmclientandSPCAT.Currently,SPCATis providedinallfourcombinationsLinuxandWindowsonx86andx86 64,butthe geneticalgorithmclientisprovidedonlyasa32-bitprogram,so64-bitoperating systemsmustrunitincompatibilitymode. 24 PAGE 35 3.4ComputingintheCloud TheNewCollegeclusterhasbeenaugmentedbyremotecomputersleasedfrom AmazonWebServicesAWS.AWSprovidesavarietyofpopularcloudcomputing" servicesatlowusage-basedrates,includingAmazonSimpleStorageServiceS3and AmazonElasticComputeCloudEC2.AmazonS3allowsuserstoinexpensively storelargeamountsofdata,althoughthisservicewasnotneededforthegenetic algorithmasitonlyoperatesonatinyamountofdata.AmazonEC2providesusers withremotecomputersinstances,whichcanbeusedforanycomputationaltaskat anhourlyrate.Anewinstancecanstartupandjointheclusterwithinminutes. ThissectionincludesdiscussionofAWSpricesandspecications.ThisinformationissubjecttochangeatAmazon'ssolediscretion,butbelievedtobecorrectasof 6March2012.Exceptwhereotherwisespecied,thissectiondiscussesOn-Demand Linux/UNIXinstancesinasinglearbitraryavailabilityzoneoftheU.S.EastNorthernVirginiaregion.Availabilityzonenamesarerandomizedforeachuser.All pricesareprovidedinU.S.cents. 3.4.1IntroductiontoAmazonEC2 AmazonEC2oersinstancesinavarietyofsizes,whereeachsizeincludesa certainamountofmemoryandCPUcapacity.Instanceswithadditionalfeatures, suchasGPUs,arealsoavailable,butarenotusablebythegeneticalgorithmand arenotdiscussedinthissection.Thepricesandspecicationsofrelevantinstance sizesarelistedinTable3.2.AmazonmeasuresCPUcapacityinEC2Compute Units"ECUs,where1ECUprovidestheequivalentcapacityofa20071.0{1.2GHz IntelOpteronprocessor[17].Theinstancesizesaregroupedintoseveralclasses,each withadierentpriceperECU-hour.Forlargeamountsofcomputation,High-CPU 25 PAGE 36 instancesaremostecient,butthesmallestHigh-CPUinstanceoersonly5ECUs, makingmicroinstancesmorecost-ecientfortasksthatrequirethreeECUsorfewer. ThiseectisshowninFigure3.3,whichshowsthepriceforagivennumberofECUs foreachclassofinstancesizes. InstanceSizeECUsMemory g /hr g /ECU-hr StandardSmall11.7GB8.0 8.0 Medium23.75GB16.0 Large47.5GB32.0 ExtraLarge815.0GB64.0 Micro x 2 : 0: x 0 : 35613MB2.0 5 : 7 High-CPUMedium51.7GB16.5 3.3 ExtraLarge207.0GB66.0 Table3.2:AmazonEC2Standard,Micro,andHigh-CPUinstancetypes 0 25 50 75 100 012345678910 Centsperhour DesirednumberofECUs StandardInstanceinmultiplesof1ECU MicroInstanceinmultiplesof.35ECUs High-CPUInstanceinmultiplesof5ECUs Figure3.3:AmazonEC2instancepriceperECU-hour 26 PAGE 37 AnAmazonMachineImageAMIdenestheoperatingsystem,includedsoftware,andotherinitialstateoftheinstance.AwidevarietyofAMIsareavailable tochoosefrom,featuringavarietyofLinuxdistributionsincludingAmazonLinux, Ubuntu,andDebianandpreconguredsoftwareforexample,awebserverora databaseengine.AMIsarealsoavailablethatrunMicrosoftWindowsServerata slightlyhigherhourlyratetocoverthecostsofthesoftwarelicense. Finally,therearemultipleinstancepurchasingoptions.InadditiontothestandardOn-DemandInstance",AmazonEC2alsooersSpotInstances,"whichallow onetobidonAmazon'sunusedcapacity.SpotInstancesareusuallydiscountedto just30{33.75%oftheOn-Demandprice,althoughtheSpotPricevarieswithsupply anddemandandoccasionallymayevencostmorethananOn-Demandinstance.The spotpricesforselectedinstancesizesoverFebruary2012isshowninFigure3.4.A SpotInstanceischargedthecurrentSpotPricewhileitisrunning,andwillautomaticallyshutdownifthepriceexceedsthemaximumbid.SpotInstancesareidealfor useintheGuerrillaClusterbecauseitispossibletotakeadvantageofthesubstantial savingswithminimalinconvenience,astheclusterisnotaectedbyworkersthat shutdownunexpectedly. AmazonWebServicesalsoprovidesafreeusageallowancefortherstyear.This includes750hoursofamicroinstancerunningLinuxand750hoursofamicroinstancerunningWindows,alongwithvariousstorageanddatatransferallowances. Thisallowedasubstantialamountoftestingtooccurwithoutcharge.However,microinstancesdonotoperateatasteadyspeed.Rather,thespeedvaries,uptoan advertisedmaximumof2.0ECUs.However,theocialdocumentationisquitevague abouthowfastamicroinstanceactuallyis.Inperformancecomparisonsofamicro instanceandastandardsmallinstance,ithasbeenfoundthatmicroinstancesaverage0.35ECUs,butcangoashighas2.5ECUsforafewsecondsatatime[18]. 27 PAGE 38 2 3 4 5 6 7 8 9 1Feb8Feb15Feb22Feb29Feb 0 10 20 30 40 50 60 70 80 90 100 aStandardSmallInstance Minimumprice2.7 g /hr bMicroInstance Minimumprice0.6 g /hr Figure3.4:AmazonEC2SpotPricesforFebruary2012 g /hour Microinstancesaredesignedforapplicationsthatarenotcomputationallyintensive, likeamodestwebserver,andarepoorlysuitedtocontinuousworklikethegenetic algorithm.However,theyaretheonlykindofinstancesupportedbythefreeusage allowance. AseriousproblemwithAWSisthattherearenomethodsforsettingbudget limitationsorprogrammaticallyaccessingusagereports.Carefulmonitoringmustbe employedtoensurethatamisconguredinstancedoesnotrunupalargebill. 3.4.2DeployingtheGeneticAlgorithmonAmazonEC2 TherstobstacletotheuseofAmazonEC2instancesintheclusteristhatthe geneticalgorithmisrunonacomputerinsideNewCollege,whichcannotbeaccessed fromtheInternetduetotheNewCollegerewall.Unfortunately,thisissueisexactly 28 PAGE 39 oppositetotherewallissuemyclusterplatformwasdesignedtoaddress.However, thisissueiseasytoresolvebecausefullcontrolisavailableovertheAmazonEC2 instances,whichisnotthecaseforthecomputersandrewallatNewCollege. Theobvioussolutiontothisissuewastousetheportforwardingcapabilityof theSSHsecureshellprotocol.SSHisapopularsystemforaccessingtheoperating systemcommand-lineofremotesystemsthatalsoprovidestheabilitytoforward networktrac,eitherallowingthelocalcomputertoaccessserversavailabletothe remotecomputerorvice-versa.Thisallowsnetworkactivitytotunnel"through therewallthroughtheSSHconnection.SSHiswidelyavailable,anditiseven preinstalledonAmazonEC2instances,asitistheonlywaytoaccessthesystem. However,thereweretwopointsthatstillneededtobeaddressed.First,thetunnel needstoberobust,andtobeautomaticallyreestablishedifitshouldbeinterrupted forsomereasone.g.anetworkoutage.Second,sincethetunnelmustbeinitiated frominsidetheNewCollegerewall,itisessentialfortheretobesomewayforthe distributedcomputingclienttorequestaccessandprovideitsIPaddresstotheserver. Aprogramwasdevelopedtoautomaticallyestablishthetunnel,monitoritsstatus, andreconnectthetunnelifitstopsworking.Theclientandserverprogramsnegotiate thetunnelviaapublicXMPPExtensibleMessagingandPresenceProtocolinstant messagingserver. Oncetherewallissuewasresolved,thenextstageofdeploymentwastodevelop aninstance.InordertominimizethemaintenanceeortrequiredtooperateAmazon EC2instances,Idecidedtouseastandardo-the-shelfAMI,ratherthanmaintaining acustomimage.Whenstartinganinstanceitispossibletospecifyuserdata"which isavailablefromwithintheinstance.MostLinuxAMIswillexaminetheuserdata atstartup,andexecuteitifitbeginswith #! "whichdenotesanexecutablescript onLinux/UNIX.Thisfeatureenabledmetodevelopashellscriptthatwouldset 29 PAGE 40 upanuncustomizedinstancebydownloadingnecessarysoftwaree.g.Perl,starting theXMPPtunneldaemon,andrunningthedistributedcomputingclient.Thescript canbeprovidedto ec2-run-instances tostartupanynumberofself-conguring instanceswithasinglecommand. ForMicroinstances,thepreferredAMIwas ami-80e915e9 ,a1GBimagerunning 64-bitDebianLinux.DebianLinuxwaschosenforthesimplicityofitsbaseinstallation;itdoesnotincludealotofunnecessaryfeatures.Thesmallsizeisadvantageous becauseMicroinstanceshavenobuilt-instorage,andmustuseElasticBlockStore, whichonlyhasafreeallowanceof30GB-monthspermonth.Forotherinstancetypes, thepreferredAMIwas ami-1cbc4375 ,whichalsoruns64-bitDebianLinux,butuses built-instorageinsteadofElasticBlockStore. Atypicalgeneticalgorithmrungoesfor1000generationswithapopulationsize of16384individuals.Thistakesabout48hourstorunonaclusterofoneHighCPUMediuminstanceECUs.Wethereforeconcludethatthegeneticalgorithm requires48 5=240ECU-hours.However,tomakeuseofthismeasurement,itis importanttoconsiderthestructureofthegeneticalgorithm.Onlytheevaluationof individualsisperformedonthecluster;thecreationofanewgenerationofindividuals fromthepreviousoneisnotparallelized.Forthisrun,about10%ofthetime secondspergenerationwasspentonthesenonparallelportions.Thisisimportant totakeintoaccountwhencalculatingthespeedupachievedforagivendivisionof 240ECU-hoursintomultipleinstances. 30 PAGE 41 Chapter4 Results Thegeneticalgorithmhasbeentestedonseveralspectraofavarietyofmolecules. IthassuccessfullytsimulatedspectraofSupranedesuane,C 3 H 2 F 6 O,2-diuoromethoxy-1,1,1,2-tetrauoroethaneat2Kand50K,however,ithasmuchmore troubleat298K.Ithasalsoperformedwellonanexperimentallyobservedspectrumat1K.Thesimulatedspectrumoftwovibrationalstatesofuorobenzene C 6 H 5 Fhasalsohadsomesuccess.Thegeneticalgorithmhashadmoretrouble ttinganobservedspectrumofstrawberryaldehydeC 12 H 14 O 3 ,ethyl3-methyl-3phenyloxirane-2-carboxylate,whichissubstantiallymorecomplexthantheother cases.Additionally,asimulatedmixtureofanisoleC 7 H 8 O,hexanalC 6 H 12 O,and orthouorotoluene-uorotoluene,C 7 H 7 F,1-uoro-2-methylbenzenehasbeenused toinvestigatethegeneticalgorithm'sperformanceonmixturesofmultiplespectra. Currently,oneofthethreespectrainthemixturehavebeensuccessfullyt. 31 PAGE 42 4.1Background Thekeyfeaturethatenabledthegeneticalgorithmtoconvergewastheabilityto limittherangeoftheparameters.AsoftwareprogramcalledGaussian[19]canbeused toproduceaninitialguessoftheparameterswithsomeuncertainty.Theestimate producedbyGaussianusuallyhasanuncertaintyofabout10%.Thisdramatically reducesthenumberofpotentialsolutions,andgivesthegeneticalgorithmmuchbetter oddsofconvergence. Thegeneticalgorithmisgenerallyrunfor1000generations,withapopulationof 16384individuals.Smallerpopulationscanhavemuchmoretroubleconverging,and largerpopulationsdonotseemtodoanybetter.Thiscantakeaslittleassixhoursto runonaclusterof150threads.Whilethismaynotbefasterthanmanualtting,it doeshavetheadvantageofrunningtheentiretimewithoutanyhumanintervention 1 Theruntimecanalsobereducedbyprovidingmorecomputationalcapacity. Thefrequencyrange{18300MHzisdividedinto1920bins,givingabin widthof5MHz.Thisissomewhatlargerthanthepeakwidthforexperimentalobservations,whichis20{40kHzforthelowtemperatureobservationsfromamolecular beamspectrometerandabout500kHzfortheroomtemperatureobservations.The seedvaluefortherandomnumbergeneratoris1748568663.Themutationprobabilityis0.015625,andelitismissetto128individuals.Uncertaintypropagation, dynamicmutation,anddoubleresonancearenotused.Randomizedbinsareturned oasitcausesthetnessfunctiontobeirregularoversuccessivegenerations,which makestheperformanceofthegeneticalgorithmfarmorediculttoanalyze. Asageneralresult,thegeneticalgorithmgetsthedistortionparameters D J 1 Aslongasthereisnohumanintervention{ifsomeoneelseinterruptsthegeneticalgorithm, you'llhavetorestartitmanually. 32 PAGE 43 D JK D K d J ,and d K verywrongineveryrun.Infact,thevaluesitproducesare oftensofarwrongastobeofoppositesignfromthecorrectanswer.Thedistortionparametersarefarlessimportantthantherotationalparameters,whichhelps thegeneticalgorithmtoproducereasonableresultsdespiteoutputtingunreasonable distortionparameters.However,thisislikelycausingtroublewithmorecomplex spectra,particularlyathighertemperatures,andshouldbeaddressedinfuturework. 4.1.1SpectralComplexity Inanexperimentalobservation,theremaybemanyoverlaidspectraofdierent conformers,isotopomers,andvibrationalstatesofthemolecule.Spectralcomplexity canbedenedasthenumberofdistinctspectraexpectedinanobservationwitha reasonable"signal-to-noiseratio. Eachmolecule{suchasC 3 H 2 F 6 O{mayhaveseveralpossiblethree-dimensional shapes,someofwhicharemorestablethanothers.Theseshapesarecalledconformationalisomers,orconformers.Asanexample,thevemostcommonconformers ofSupraneareshowninFigure4.1.Dierentconformershavedierentrotational behavior,andthusdierentspectra.Thenumberofanddistributionofconformersinanobservationdependsonseveralfactors.Ingeneral,largermoleculeshave moreconformersthansmallermolecules,becausethelargermoleculeismorelikely tohaveexiblebonds.Additionally,thepopulationofdierentconformersfollows aBoltzmanndistribution,whichdependsontemperature,soobservationsathigher temperaturesgenerallyhavemoreconformersthanobservationsatlowertemperatures. Isotopomersaremoleculesthatcontainanuncommonisotopeofoneormore atomswithinamolecule,suchasreplacingacarbon-12atomwithacarbon-13.Many atomshavemultiplestableisotopes,butthereisusuallyonlyonecommonisotope. 33 PAGE 44 Forexample,98.89%ofallcarbonatomsarecarbon-12.Dierentisotopeshave dierentmass,thereforedierentisotopomershavedierentmassdistributionsand dierentspectra.However,mostisotopomershaveverylownaturalabundance,so isotopomersdonotusuallycauseproblemswhenttingthedominantchemicalspecies inaspectrum. Moleculesmayalsoexhibitinternalvibration,whichaectstherotationalbehaviorofthemoleculeasawhole.Amoleculewith n atomswillhave3 n )]TJ/F15 11.9552 Tf 11.531 0 Td [(6vibrational modes,anycombinationofwhichmaybeexcited.Moleculesthathavenoexcited vibrationalmodesareinthevibrationalgroundstate,whichismostcommon.However,athighertemperatures,othervibrationalstatesmayalsobepopulated{at roomtemperaturetheremaybe5{10foreachconformer.Thisinternalmotionwill aecttherotationofthemoleculeaswell,andsothedierentvibrationalstateshave dierentspectra. Thereareanumberofotherhazardsthatmaymakeaspectrummorecomplex. F F F F F F O C C C a b c F F F F F C C F C O a b c CC C F F F F F F O a b c F F F O C F C C F F a b c C C a b c F F F F F F C O ConformerIConformerIIConformerIII ConformerVConformerIV Figure4.1:ConformersofSuprane.Adaptedfrom[20]. 34 PAGE 45 Themostpersistentoftheseisnoise.Thedatamayalsobeuntrustworthy,dueto asubtlymalfunctioningormiscalibratedspectrometer.Thisresultsinunexpected spectralbehavior,suchasthejumpinintensityseeninthestrawberryaldehydeobservationnear12.2GHz,seeninFigure4.2.Thechemicalsamplemayalsohave impurities,whichmaybecausedbythemanufacturingprocess,orresiduefrompreviousexperiments.Themajorityofimpuritiesexistatalowlevel,butsomehave extremelyintensespectrathatmayactuallybestrongerthanthesampleunderinvestigation.Methanolimpuritiesareparticularlyproblematicinthisregard.Finally, theremaybesomeuncertaintyintheparametersoftheobservationalenvironment, suchasthetemperature.Theaccuracyofthetemperatureestimateisespeciallyimportantatlowtemperatures,whereusinga2Kestimateforaspectrumobservedat 1Kmaybesucienttopreventthegeneticalgorithmfromconverging. 0 0.1 0.2 0.3 0.4 0.5 0.6 9000100001100012000130001400015000160001700018000 Intensityarbitraryunits FrequencyMHz Figure4.2:Experimentallyobservedspectrumofstrawberryaldehydeat298K 35 PAGE 46 4.2Suprane Supranedesuaneisaninhalantgeneralanesthetic.Itisasmallmolecule,with just10atoms,andhasonlyonedominantconformer.Themoleculeisshownin Figure4.3. Thedipolemomentsare A =1 : 476, B =0 : 758,and C =0 : 235allinDebyes. F F F F F F O C C C H H SupraneC 3 H 2 F 6 O 2-diuoromethoxy-1,1,1,2-tetrauoroethane Figure4.3:Suprane 4.2.1SimulatedSpectraofSuprane SimulatedspectraofSupranehavebeentbythegeneticalgorithmat2KFigure4.4and50KFigure4.5.Sofar,thegeneticalgorithmhasnotbeenableto tthe298KspectrumFigure4.6.Thenalvaluesfortheparametersarelistedin Table4.1,alongwiththeactualconstants[20].Thetnessandrotationalparameters areplottedateachgenerationinFigure4.7andFigure4.8,respectively. Eachrunwasstartedwithaparameterrangeof10%aboutaninitialinexact guessforthe A B ,and C parameters. At2K,thisisquiteasimpleproblemforthegeneticalgorithm.Theresultingspectrumisidenticaltothesimulatedspectrumbeingmatched.Infact,ithas 36 PAGE 47 0 2e-06 4e-06 6e-06 8e-06 1e-05 1.2e-05 1.4e-05 1.6e-05 1.8e-05 Intensityarbitraryunits 0 1e-05 2e-05 3e-05 4e-05 5e-05 6e-05 7e-05 8e-05 9000100001100012000130001400015000160001700018000 GAResult SupraneK 0 5e-07 1e-06 1.5e-06 2e-06 2.5e-06 3e-06 3.5e-06 Intensityarbitraryunits 0 2e-06 4e-06 6e-06 8e-06 1e-05 1.2e-05 1.4e-05 1340013500136001370013800 GAResult SupraneK FrequencyMHz Figure4.4:SimulatedspectrumofSupraneat2K 37 PAGE 48 0 1e-06 2e-06 3e-06 4e-06 5e-06 6e-06 Intensityarbitraryunits 0 1e-06 2e-06 3e-06 4e-06 5e-06 9000100001100012000130001400015000160001700018000 GAResult SupraneK 0 1e-07 2e-07 3e-07 4e-07 5e-07 6e-07 7e-07 8e-07 Intensityarbitraryunits 0 1e-07 2e-07 3e-07 4e-07 5e-07 6e-07 7e-07 1340013500136001370013800 GAResult SupraneK FrequencyMHz Figure4.5:SimulatedspectrumofSupraneat50K 38 PAGE 49 0 2e-07 4e-07 6e-07 8e-07 1e-06 1.2e-06 1.4e-06 1.6e-06 Intensityarbitraryunits 0 1e-06 2e-06 3e-06 4e-06 5e-06 6e-06 9000100001100012000130001400015000160001700018000 GAResult SupraneK 0 5e-08 1e-07 1.5e-07 2e-07 2.5e-07 3e-07 3.5e-07 4e-07 Intensityarbitraryunits 0 5e-07 1e-06 1.5e-06 2e-06 2.5e-06 3e-06 3.5e-06 4e-06 130001320013400136001380014000 GAResult SupraneK FrequencyMHz Figure4.6:SimulatedspectrumofSupraneat298K 39 PAGE 50 -0.1 -1 -10 -100 02004006008001000 Fitnessarbitraryunits SupraneK FitnessofBestIndividual AverageFitnessofPopulation -0.1 -1 -10 -100 02004006008001000 Fitnessarbitraryunits SupraneK FitnessofBestIndividual AverageFitnessofPopulation -0.1 -1 -10 -100 02004006008001000 Fitnessarbitraryunits SupraneK FitnessofBestIndividual AverageFitnessofPopulation Generation Figure4.7:GeneticalgorithmtnessovertimeforsimulatedspectraofSuprane 40 PAGE 51 -1 -0.5 0 0.5 1 050100150200250300 Percentdeviation SupraneK 8009001000 A B C -1 -0.5 0 0.5 1 050100150200250300 Percentdeviation SupraneK 8009001000 A B C -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation SupraneK 8009001000 A B C Generation Figure4.8:RotationalparametersovertimeforsimulatedspectraofSuprane 41 PAGE 52 GAResult ParameterActual2K50K298K A MHz2330.145712330.126132330.156652522.95425 B MHz865.37070865.36729865.38076972.76529 C MHz753.18860753.25772753.18331678.98248 D J Hz97.932236.96586886.752028-30.088627 D JK Hz-217.39-224.892847-106.150662-1904.611107 D K Hz982.1280.077976-110.320651-694.048721 d J Hz24.812-128.20630242.546332-1872.399536 d K Hz124.9-1361.310572-105.686406-223.297963 Table4.1:ResultsforsimulatedspectraofSuprane eectivelyconvergedtothecorrectanswerwithinjust100generations.The50K spectrumisslightlymoredicult.Onalargescaletheresultfor50Kisnearlyindistinguishablefromtheactualspectrum,althoughtherearesomedetailsthatdier whenzoomingin.Italsotakestwiceaslongforthegeneticalgorithmtoconvergeto thesolution. Ontheotherhand,at298K,thegeneticalgorithmhasclearlyfailedtoconverge. Reducingtheparameterrangefrom10%to5%,whichmakestheproblemmuch easier,isnotsucienttoimproveconvergence.Using600or6000binsinsteadof thestandard1920doesnothelpmucheither,although600binsprovidestheclosest resultsofarTable4.2. 42 PAGE 53 GAResultGAResultswith5%parameterrange ParameterActualStandard600bins1920bins6000bins A MHz2330.145712522.954252386.362392404.397502408.10777 B MHz865.37070972.76529866.60583942.74248940.92714 C MHz753.18860678.98248751.49294718.62516714.98181 D J Hz97.932-30.088627-271.9542031473.182123834.969954 D JK Hz-217.39-1904.611107-1973.542509-1696.444694-135.560303 D K Hz982.12-694.0487212062.324233-953.613032-1066.276793 d J Hz24.812-1872.3995361899.416808-2145.298150-2059.999475 d K Hz124.9-223.297963-796.211167733.574145-768.377560 Thestandardcongurationisa10%parameterrangewith1920bins. Table4.2:ResultsforsimulatedspectraofSupraneat298Kundervariousgenetic algorithmcongurations 43 PAGE 54 4.2.2ObservedSpectrumofSuprane AnobservedspectrumofSupranecollectedat1KhasalsobeentFigure4.9. Theparametervaluesproducedbythegeneticalgorithm,alongwiththeactualvaluesforcomparison,areinTable4.3.Thegeneticalgorithmtnessisplottedin Figure4.10,andthebehavioroftherotationalparametersisshowninFigure4.11. ParameterActualGAResult A MHz2330.145712329.88998 B MHz865.37070865.31232 C MHz753.18860753.16727 D J Hz97.932-229.312948 D JK Hz-217.39-1572.149845 D K Hz982.12-1680.333773 d J Hz24.812-35.601592 d K Hz124.9-1017.692821 Table4.3:ResultsforobservedspectrumofSupraneat1K Somewhatsurprisingly,thegeneticalgorithmdoesnothavemuchmorediculty withthisproblemthanwiththesimulatedspectrum.Infact,thetbecomesquite goodevenafterabout250generations.Thisshowsthatthegeneticalgorithmcan besupplieddatathatcontainsnoiseandstillproduceagoodsolution.Aswiththe observedspectra,thegeneticalgorithmwasprovidedwith10%parameterranges aroundinitialestimatesofthe A B ,and C parameters. 44 PAGE 55 0 2e-06 4e-06 6e-06 8e-06 1e-05 1.2e-05 Intensityarbitraryunits 0 0.0005 0.001 0.0015 0.002 0.0025 9000100001100012000130001400015000160001700018000 GAResult Suprane 0 1e-06 2e-06 3e-06 4e-06 5e-06 6e-06 7e-06 8e-06 9e-06 Intensityarbitraryunits 0 0.0002 0.0004 0.0006 0.0008 0.001 130001320013400136001380014000 GAResult Suprane FrequencyMHz Figure4.9:ObservedspectrumofSupraneat1K 45 PAGE 56 -0.1 -1 -10 02004006008001000 Fitnessarbitraryunits FitnessofBestIndividual AverageFitnessofPopulation Generation Figure4.10:GeneticalgorithmtnessovertimeforobservedspectrumofSuprane -10 -5 0 5 10 050100150200250300 Percentdeviation 8009001000 A B C Generation Figure4.11:RotationalparametersovertimeforobservedspectrumofSuprane 46 PAGE 57 4.3Fluorobenzene FluorobenzeneFigure4.12isanothersmallmolecule,with12atoms.Because ithasnoconformersandnonotableisotopomers.89%ofcarboniscarbon-12, 99.98%ofhydrogenishydrogen-1,anduorinehasonlyonestableisotope,ithas asmallnumberofoverlappingspectra,allofwhichareduetoexcitedvibrational states.Thedipolemomentsare A =1 : 555, B = C =0 : 0Debyes. F C C C C C C H H H H H Figure4.12:FluorobenzeneC 6 H 5 F Thisspectrumisacombinationofthesimulatedspectraoftwopreviouslyt vibrationalstates,namelythegroundand V 11 states[21].Whenrunonthisspectrum, thegeneticalgorithmconvergestorotationalparametersthatapproximatethoseof the V 11 vibrationalstateHowever,therotationalparametersforthetwovibrational statesarequitesimilar,andevenaminorperturbationtotheruncanbeenoughto causethegeneticalgorithmtoconvergetotheotherstate.Forexample,changing A to1.7051shouldhavelittleimpactbecause B = C =0,sotheonlyimpactshould betoscaletheexactbutnotrelativepeakintensities.However,thisisenoughto causethegeneticalgorithmtoconvergetothegroundstateinsteadof V 11 47 PAGE 58 ThevaluesforthevariousparametersarelistedinTable4.4,andtheresulting spectracanbeseeninFigure4.13.Whenrunwiththecorrectvalueof A ,thegenetic algorithmapproachesthe V 11 vibrationalstatequiteslowly,withtheparametervalues settlingdownaftergeneration600,althoughthetnesshasbeenrelativelystablesince generation300.Thiscanbeseeninthetnessandparameterswhenplottedover timeFigure4.14andFigure4.15.Withthemodied A ,thegeneticalgorithmgets verycloseinjustover100generations,althoughthetnesscontinuetoimproveuntil generation300Figure4.16andFigure4.17. Thegeneticalgorithmwasconguredwiththerotationalparameterslimitedtoa 10%rangeaboutaninitialguess. ActualGAResultActualGAResult ParameterGroundStateModied A V 11 Correct A A MHz5663.7134775666.440955641.448145637.04628 B MHz2570.6531832570.626772571.8048562572.00604 C MHz1767.9137201767.802511769.6934771769.71062 D J Hz133.19222064.431606133.97971619.719483 D JK Hz214.3376-1449.047262217.757809.784182 D K Hz836.01034.634023731.692040.015376 d J Hz43.55529-260.53550043.73931167.716450 d K Hz373.0502116.007571361.343-33.716201 Table4.4:Resultsforsimulatedspectrumofuorobenzeneat298K 48 PAGE 59 0 5e-08 1e-07 1.5e-07 2e-07 2.5e-07 3e-07 3.5e-07 4e-07 4.5e-07 Intensityarbitraryunits 0 2e-07 4e-07 6e-07 8e-07 1e-06 1.2e-06 9000100001100012000130001400015000160001700018000 GAResultcorrect A GAResultmodied A Fluorobenzene 0 2e-08 4e-08 6e-08 8e-08 1e-07 1.2e-07 1.4e-07 Intensityarbitraryunits 0 1e-07 2e-07 3e-07 4e-07 5e-07 130001320013400136001380014000 GAResultcorrect A GAResultmodied A Fluorobenzene FrequencyMHz Figure4.13:Simulatedspectrumofuorobenzeneat298K 49 PAGE 60 -0.1 -1 -10 02004006008001000 Fitnessarbitraryunits FitnessofBestIndividual AverageFitnessofPopulation Generation Figure4.14:Geneticalgorithmtnessovertimeforsimulatedspectrumofuorobenzenecorrect A -4 -2 0 2 4 050100150200250300 Percentdeviation 8009001000 A B C Generation Figure4.15:Rotationalparametersovertimeforsimulatedspectrumofuorobenzene correct A 50 PAGE 61 -0.1 -1 -10 02004006008001000 Fitnessarbitraryunits FitnessofBestIndividual AverageFitnessofPopulation Generation Figure4.16:Geneticalgorithmtnessovertimeforsimulatedspectrumofuorobenzenemodied A -4 -2 0 2 4 050100150200250300 Percentdeviation 8009001000 A B C Generation Figure4.17:Rotationalparametersovertimeforsimulatedspectrumofuorobenzene modied A 51 PAGE 62 4.4StrawberryAldehyde Strawberryaldehydeisachemicalusedforarticialstrawberryavor[22],shown inFigure4.18.Itisverylarge,with29atomsandveprimaryconformers,and wasexperimentallyobservedat2K.Thesefactorsconspiretoproduceacomplex observation,withmanyoverlaidspectra,whichmakesttingstrawberryaldehyde muchmoredicultthanSupraneoruorobenzene.TheveconformersarecisI, cisII,transI,transII,andtransIIIalsoreferredtoasc-sg )]TJ/F15 11.9552 Tf 7.085 -4.339 Td [(,c-ag + ,t-aa,t-ag )]TJ/F15 11.9552 Tf 7.085 -4.339 Td [(,and t-ag + ,respectively.ThedipolemomentsforeachconformerarelistedinTable4.5. O O O C C C C C C C C C C C C H H H H H H H H H H H H H H StrawberryAldehydeC 12 H 14 O 3 ethyl3-methyl-3-phenyloxirane-2-carboxylate Figure4.18:Strawberryaldehyde Conformer A B C cisI1.0000.4200.000 cisII0.0000.2951.000 transI0.0001.0000.403 transII0.0001.0000.639 transIII0.0001.0000.253 Table4.5:Relativedipolemomentsforstrawberryaldehyde 52 PAGE 63 Thegeneticalgorithmdoesnotdowellonthisspectrumwitha10%rangearound aninitialguessforthe A B ,and C parameters.Theresultsfortheveconformers areshowninTable4.6.ThecisIconformerFigure4.19,liketheothers,looks likeapoormatch,howeverthisistheconformerforwhichthegeneticalgorithmhas comeclosesttothecorrectrotationalparametervaluesFigure4.20.Therestofthe conformersareevenfurtherawayfromthecorrectanswerFigure4.21.Although cisII,transI,andtransIIhave A valuesthatareallveryclosetothecorrectanswer, thisisnotsucientforagoodt. cisIcisII ActualGAResultActualGAResult A 728.09520735.65712723.14170723.73782 B 628.69170623.54379581.41140652.22016 C 429.84840429.85050421.92670449.49666 D J 80.9177.116743343.1-113.424885 D JK -107.0341.933560-863.0625.378198 D K 90.0-1847.054874657.0-231.891786 d J -25.7113.585467-119.4-1526.057502 d K 2.3-35.2557228.6-1999.163201 transItransIItransIII ActualGAResultActualGAResultActualGAResult A 1214.72791214.667101460.52301460.379361330.94931418.73764 B 287.76661312.47136275.04586306.14388293.48150328.68402 C 269.46241243.01864269.61760238.56491281.96590261.10250 D J 17.41-1735.94616410.80-988.97450517.201739.213071 D JK -77.01820.617305-50.02089.843010-82.01522.732456 D K 820.0-651.3072621350.01359.150777950.0-1739.307542 d J -3.410286.829855-0.91-973.572256-3.142101.935070 d K 0.800-2119.3114950.700795.935082-0.750466.619458 Parameters A B ,and C areinMHz; D J D JK D K d J ,and d K areinHz. Table4.6:Resultsforsimulatedspectrumofstrawberryaldehyde 53 PAGE 64 0 1e-06 2e-06 3e-06 4e-06 5e-06 6e-06 Intensityarbitraryunits 0 0.1 0.2 0.3 0.4 0.5 9000100001100012000130001400015000160001700018000 GAResultcisI StrawberryAldehyde 0 5e-07 1e-06 1.5e-06 2e-06 2.5e-06 3e-06 3.5e-06 4e-06 4.5e-06 5e-06 Intensityarbitraryunits 0 0.05 0.1 0.15 0.2 0.25 0.3 130001320013400136001380014000 GAResultcisI StrawberryAldehyde FrequencyMHz Figure4.19:Observedspectrumofstrawberryaldehydeat2K,conformercisI 54 PAGE 65 -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation cisI 8009001000 A B C -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation cisII 8009001000 A B C Generation Figure4.20:RotationalparametersovertimeforobservedspectraofcisIandcisII conformersofstrawberryaldehyde 55 PAGE 66 -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation transI 8009001000 A B C -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation transII 8009001000 A B C -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation transIII 8009001000 A B C Generation Figure4.21:RotationalparametersovertimeforobservedspectraoftransI,transII, andtransIIIconformersofstrawberryaldehyde 56 PAGE 67 4.5Anisole/Hexanal/OrthouorotolueneMixture Onefactorthatcontributestothedicultyofthestrawberryaldehydetis thattheparametervaluesaresimilarforthedierentconformers.Themoleculesin thismixture{anisoleFigure4.22a,hexanalFigure4.22b,andorthouorotoluene Figure4.22c{werechosentoavoidthisproblem,hopefullyresultinginaneasier taskforthegeneticalgorithm.Thedipolemomentsforeachmoleculearelistedin Table4.7.Themixturewassimulatedat5K. C C C C C C O H H H H H H H H H H H H bHexanalC 6 H 12 O O C C C C C C C H H H H H H H H aAnisoleC 7 H 8 O H H H H H H H C C C C C C C F cOrthouorotoluene-uorotoluene, C 7 H 7 F,1-uoro-2-methylbenzene Figure4.22:Anisole,hexanal,andorthouorotoluene TheresultingandactualparametervaluesareinTable4.8.Withthesame10% 57 PAGE 68 Molecule A B C Anisole0.751.000.00 Hexanal1.252.250.00 Orthouorotoluene1.001.000.00 Table4.7:DipolemomentsformixturecomponentsinDebyes rangeontherotationalparametersaswiththeotherruns,thegeneticalgorithmonly convergedfororthouorotoluene.Inthatcase,theresultingspectrumFigure4.23 matchescloselywithpeaksinthemixturespectrum.Itcanbeseenfromtheplotof rotationalparametersFigure4.24thatconvergenceoccurredinunder300generations. AnisoleHexanalOrthouorotoluene ActualGAResultActualGAResultActualGAResult A 5028.843885367.458419812.15188845.684153243.147823243.37789 B 1569.364251651.21873873.82491959.330612180.432672180.57617 C 1205.825541111.85832822.58606824.193141314.385721314.38399 D J 60.36025-1372.91133646.63251893.658002293.6731404.864505 D JK 40.550301700.561318-869.77224-962.386727-855.613-19.145862 D K 784.19114-778.81193424456.1181997.735625712.3381854.041966 d J 15.93986485.4702924.97571787.405679112.865501.705549 d K 178.8083748.182252-379.285751696.629073-314.806-651.743241 Parameters A B ,and C areinMHz; D J D JK D K d J ,and d K areinHz. Table4.8:Resultsforsimulatedspectrumofmixture 58 PAGE 69 0 1e-06 2e-06 3e-06 4e-06 5e-06 6e-06 7e-06 8e-06 Intensityarbitraryunits 0 5e-07 1e-06 1.5e-06 2e-06 2.5e-06 3e-06 3.5e-06 4e-06 4.5e-06 9000100001100012000130001400015000160001700018000 GAResultOrthouorotoluene Mixture 0 5e-07 1e-06 1.5e-06 2e-06 2.5e-06 3e-06 3.5e-06 Intensityarbitraryunits 0 5e-07 1e-06 1.5e-06 2e-06 130001320013400136001380014000 GAResultOrthouorotoluene Mixture FrequencyMHz Figure4.23:Observedspectrumofmixturecomponentorthouorotoluene 59 PAGE 70 -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation Anisole 8009001000 A B C -15 -10 -5 0 5 10 15 050100150200250300 Percentdeviation Hexanal 8009001000 A B C -2 0 2 4 6 8 10 12 050100150200250300 Percentdeviation Orthouorotoluene 8009001000 A B C Generation Figure4.24:Rotationalparametersovertimeforsimulatedspectrumofthemixture 60 PAGE 71 Chapter5 Conclusions Thegeneticalgorithmhashadsomesuccessforsimplespectra,allconverging within300generations.IthassuccessfullytsimulatedspectraofSupraneat2K and50K.Ithasalsotanexperimentallyobservedspectrumat1K,whichdemonstratesthegeneticalgorithm'sabilitytosucceeddespitethepresenceofnoise,sample impurities,andotherexperimentalartifacts.Whenrunona298Ksimulatedcombinedspectrumoftwovibrationalstatesofuorobenzene,thegeneticalgorithmhas gottenveryclosetobothstates.Thegeneticalgorithm'schoiceofstateseemsto dependontheexactparametersoftherun.Finally,onasimulatedmixtureofthe 2Kspectaofanisole,hexanal,andorthouorotoluene,thegeneticalgorithmhasbeen abletoconvergeontheorthouorotoluenecomponent. However,ithasnotbeenassuccessfuloneveryspectrumithasbeentestedon. Inparticular,ithasfailedtotthe298KsimulatedspectrumofSuprane.Ithasalso beenunabletottheanisoleandhexanalcomponentsofthe3-moleculemixture. Thegeneticalgorithmhasbeenunsuccessfulwithallveconformersofthe2K experimentalobservationofstrawberryaldehydeaswell. 61 PAGE 72 5.1FutureWork Thereareseveralaspectstotheprojectthatmaybenetfromadditionalresearch. Whilethegeneticalgorithmcangetveryclosetothecorrectrotationalparameters A B ,and C ,itconsistentlyfailstoproducereasonablevaluesforthedistortion parameters D J D JK D K d J ,and d K .Thisisfarmoreimportantforroomtemperaturespectrathanforlowtemperaturespectra,andmaybecontributingtothe poorperformanceon298KSuprane.Thegeneticalgorithm'sabilitytoproducesane distortionparametersshouldbeimproved. Thegeneticalgorithmhasbeenfoundnottoworkverywellwithmultipleoverlaid spectra.Currently,thegeneticalgorithmisrunindependentlyoneachcomponent spectrum.Acooperativemethodforttingthecomponentspectramaygreatlyimprovetheperformanceofthegeneticalgorithminthesecases.Expandingthetnessmeasurementtoconsiderthequalityofproposedsolutionsfromeachrunwhen combinedwithproposedsolutionsfromtheconcurrentrunswillhelppreventthe individualrunsfrommatchingtoomanypeaksthatarepartofotherspectra.The combinationevaluationsgreatlyincreasestheamountofcomputationthatisrequired, butmayproducesubstantiallyimprovedresults. ItmaybeusefultochainthegeneticalgorithmtoIanFinneran'striplestter"[4] automatedttingprogram.Oneofitsdisadvantagesisthatitrequiresamuchmore preciseinitialguessthanthegeneticalgorithmdoes.Therefore,evenifthegenetic algorithmdoesnotconverge,itmaystillbeabletoreducetheuncertaintyrangeon therotationalparameters.Developinganinitialguessforthetriplestterfromthe topindividualsfromthegeneticalgorithmmayallowthetriplesttermaytolocate theexactsolution,eventhoughneitherprogramcouldnditalone.Itmayalsobe usefultochainmultiplegeneticalgorithmrunstogether,withgradualadjustments 62 PAGE 73 totheparametersbinsize,parameterrange,numberofindividuals,etc. Itisalsobepossiblethatsomeotheroptimizationstrategymaybemoresuitable forthisproblemthanageneticalgorithm.Ingeneral,theeectivenessofdierent searchstrategiesforagivenproblemseemstobehighlyvariable,andeachsearch strategyseemstohaveafewproblemsinwhichitspecializes.Evenwhileretaining thesametnessfunction,achangetotheunderlyingoptimizationalgorithmwillstill resultinaradicallydierenttraversalofthelandscapeofpotentialsolutions.Parallelsimulatedannealingalgorithms[23]areparticularlypromising,butavarietyof methodsshouldbeconsidered.Thisisrelativelyeasytoimplementbecausethespectroscopytnessfunctionisessentiallyindependentfromtheactualgeneticalgorithm, andcanlikelybereusedformostoptimizationalgorithms. Thereisalsoroomforperformanceimprovements.Currently,thegeneticalgorithmspendsanoticeableamountoftimecreatinganewgenerationfromtheprevious one.Thisprocesscouldpotentiallybenetfromoptimization,andmayevenbeparallelizable. 63 PAGE 74 AppendixA DistributedComputingProtocols Thisappendixdocumentstwoofthedistributedcomputingprotocols. Therstprotocolisusedbythegeneticalgorithmtosendthelistofindividuals tothedistributedcomputingsystem.Adistributorprogramimplementingthisprotocolcanbeusedtosendworkfromthegeneticalgorithmtosomeotherdistributed computingplatform. Thesecondprotocolisusedtosendworkunitstothedistributedcomputingserver. AprogramimplementingthisprotocolcansendworkfromsomeotherkindofcomputationtotheGuerrillaCluster.Thisalsoinvolvesimplementingaplug-inforthe distributedcomputingclientanapp"toperformyourcomputation.Information aboutimplementingtheappisalsoincluded. AllprotocolcommunicationsareperformedoverTCPsockets.ThegeneticalgorithmsupportsbothIPv4andIPv6;however,thedistributedcomputingserver currentlysupportsonlyIPv4.ProtocolimplementorsshouldassumeUTF-8characterencoding.Theprotocolsarelineoriented,andimplementationsmustaccept either CRLF or LF lineendings. 64 PAGE 75 A.1PortingtheGeneticAlgorithmtoOtherDistributedComputingPlatforms When ga-spectroscopy thegeneticalgorithmisrunwiththe --distributed hostname : port option,itwillestablishaTCPconnectiontothedistributorprogram atthataddressandusethisprotocoltosendindividualstothedistributorandreceive tnessvaluesinreturn.Eachincomingconnectiontothedistributorcorrespondsto aseparategeneticalgorithminstance. Uponconnectingtothedistributor,thegeneticalgorithmsendsastanzacongurationinformation;anexampleisshowninFigureA.1.Theaccumulatedconguration mustbesavedandsenttotheworkercomputers.Congurationinformationisformattedaslinesbeginningwith CFG followedbyanoptionnameandargument,which mapdirectlytocommand-lineorcongurationleoptionsacceptedbythegenetic algorithm.Acongurationstanzawillendwithalinebeginningwith V .Thedata onthe V lineisusedbythegeneticalgorithmtoensurethatthegeneticalgorithm versionisthesameonallcomputers. CFGFmatchdata/suprane.cat CFGFtemplatedata/template-suprane CFGFamin207000000 CFGFamax253000000 CFGAconfigdata/suprane-10percent.conf V6c92b3417368469948fc6dac1e30920b FigureA.1:Anexampleofacongurationstanza Somecongurationoptionsspecifydatales.Thedistributorisresponsiblefor makingsurethatallrequireddatalesareavailableontheworkercomputers.It alsomayneedtomodifythecongurationdatatohavethecorrectlocationofthese lesoneachworkercomputer.Notalldatalesusedbythegeneticalgorithmare 65 PAGE 76 requiredontheworkercomputers. Itmaybebenecialtoremovesomeduplicateoptionspecications,asthelast occurrencetakesprecedenceinmostcases.However,afewoptionsmayusefullybe speciedmultipletimes,sonotalloptionscanbeoptimizedthisway.Nonetheless, thisoptimizationisimportant,assomeoptionsmaychangeforeachgeneration. Thedistributorshouldalsoavoidsendingdatalesthatareobsoletedbymultiple occurrencesofanoption. Additionalcongurationinformationmaybeprovidedatanytime.Thismay occurifthereisaparameterthatisbeingvariedforeachgeneration.Additional congurationwillbeformattedinthesamewayastheinitialstanza,beginningwith a CFG lineandendingwitha V line.Thenewoptionstakeeectimmediately. Afterthedesiredcongurationhasbeensent,thegeneticalgorithmwillsendthe listofindividuals.Eachindividualisformattedas I ,followedbythenumberofthe individual,andthevarioussegmentsofthegenome.Afterthelistofindividuals,the generationnumberisprovidedasalinebeginningwith G .Thisisfollowedbythe statement DISPATCH ,whichindicatesthatthecompletelistofindividualshasbeen providedandthedistributormaybeginevaluatingtheindividuals.Anexampleis showninFigureA.2. I0d37e0a54dd86894342998285d3afe5dcbf4faa5bd2198c276cfc8846eaff0 I1ddad2575c719fd42119674b4aada7f3cfdd966290882df31a2304609fb80 I2eadaefc504a29f48f19106af38819ed6f96de9a7a8275796f5e2196242804 I3c62f206509903f4d53f5f94d349ba584c8a65150c00cc4dfc596eb11b8dbb I4d5bb26153dc06c4c5c6135f9bf0a8fc398771a200839df682ede7baeeb12 G0 DISPATCH FigureA.2:Anexamplelistofindividuals Thedistributormaynowcomposeworkunitsthatcontainasubsetoftheindividualsinthepopulation.Thesewillbeevaluatedbythe ga-spectroscopy-client 66 PAGE 77 program,asubsetof ga-spectroscopy thatrunsontheworkercomputers.Inadditiontorequireddatales, ga-spectroscopy-client requiresaninputlecontaining thecongurationandpopulationsubsetdata. ga-spectroscopy-client storesthe tnessofeachindividualandstoretheresultinanoutputle.Acompleteexample isshowninFigureA.3. CFGFmatchdata/suprane.cat CFGFtemplatedata/template-suprane CFGFamin207000000 CFGFamax253000000 CFGAconfigdata/suprane-10percent.conf V6c92b3417368469948fc6dac1e30920b G0 I3c62f206509903f4d53f5f94d349ba584c8a65150c00cc4dfc596eb11b8dbb I4d5bb26153dc06c4c5c6135f9bf0a8fc398771a200839df682ede7baeeb12 aInputle F0003-1641.962076E F0004-712.843783E bOutputle FigureA.3:Anexample ga-spectroscopy-client input/outputcombination Theoutputlecontainslinesbeginningwith F ,followedbythenumberofthe individual,thetnesss,andanend-of-lineindicator E .Thedistributorreturnsthese linestothegeneticalgorithmwithoutmodication.Theymaybereturnedinanyorder.Oncealloftheindividualshavebeenevaluated,thedistributorsendsthegenetic algorithmtheline DONE .Thegeneticalgorithmmaythensendtheindividualsofthe nextgeneration.Thegeneticalgorithmwillcloseitsconnectiontothedistributor whenitcompletesi.e.therearenomoregenerations. 67 PAGE 78 A.2SendingWorktotheGuerrillaCluster TorunscienticcomputationsontheGuerrillaCluster,itisnecessarytoimplementtwosoftwarecomponents.Therstisadistributorthatimplementsthe communicationsprotocoldescribedinthissection.Thesecondisanapp,"which runsonworkercomputersaspartofthedistributedcomputingclient,andincludes allsoftwarerequiredtoperformthenecessarycomputations. A.2.1CommunicationsProtocol ThedistributorestablishesaTCPconnectiontoaknowndistributedcomputing server.Theserverwillgreetthedistributorwitha HELLO response,whichthedistributormustrespondtoappropriately.Thedistributormaythenrequestaworker computerwiththe HAVEWORK command.Thisprotocolencodes object susingJSON JavaScriptObjectNotation. Commands HAVEWORK Requestaworkerfromtheserver.Ifaworkerisavailable,theserverwillreturn a WORKER responsewithinformationaboutanavailableworker.Ifnot,itwill returna NOWORKERS response. HELLO versionnamekey version mustbe 0 zero. name shouldbeauniqueprex,usedforthenames ofallworkunitssubmitted. key istheserver'sworkacceptancepassword,which theserveroutputswhenitstartsup.Ifthe key iscorrect,theserverwillreturn an OK response. 68 PAGE 79 NOMOREWORK Optional.Indicatetotheserverthatyouwillnotbesubmittinganymore workunitsuntilallcurrentlyexecutingworkunitsarecompleted.Thiswillcause theservertobemoreaggressiveatcancelingworkunitsthatareoverdue. DISPATCH object object isaworkunit. id string Auniqueidentierfortheworkunit.Shouldcontainonlyalphanumeric characters,plushyphen,andmustbeginwiththeprexspeciedinthe HELLO command. duration integer Theexpecteddurationoftheworkunit,inseconds.Theworkunitmaybe canceledifittakeslongertoexecutethanthespeciedduration. files array Alistofinputles.Eachelementofthearrayisanotherarrayofthree elements.TherstelementistheMD5checksumofthele,thesecond istheHTTPor data: URLofthele,andthethirdisthenameofthe leontheworkercomputer.Oneofthelesmustbetheapp,whichmust havealenameontheworkerendingin .app upload string AnHTTPURLtoanuploadserviceforoutputles,or data: .Each outputleswillbesenttothisURLusingaseparate HTTPPUT request. Theresponsetothisrequestuploadshouldbeasimpleidentiere.g. alphanumericcharactersplusperiod,hyphen,andunderscorethatcan beprovidedinthe WORKFINISHED responseastheuploadedlocation"of 69 PAGE 80 thele.Thevaluemustbesucienttoallowthedistributortolocate theuploadedle.Ifthevalueofthiskeyis data: ,a data: URLwillbe generatedforeachoutputleandthegeneratedURLwillbeusedasthe uploadedlocation.Thisisrecommendedforsmalllesonly. worker string Theuniqueidentierfortheworkertheworkunitshouldbesentto. Responses ERR string Anerrorhasoccurred; string isahuman-readabledescription. HELLO version Greetingresponsesentbytheserveruponconnection. version istheversionof theserversoftware.Thedistributormustreplywitha HELLO command. NEWWORKER Sentbytheservertonotifythedistributorthatanidleworkerisavailable.The distributormayrespondwitha HAVEWORK commandifithasworkavailable. NOWORKERS Sentbytheserverinreplytoa HAVEWORK commandifnoworkersareavailable.A subsequent WORKFINISHED WORKFAILED WORKREJECTED ,or NEWWORKER response indicatesthataworkermighthavebecomeavailable. OK string Responsetoa HELLO command,indicatingthatthedistributorisproperlyauthenticated; string isahuman-readabledescription. 70 PAGE 81 WORKER object Sentbytheserverinreplytoa HAVEWORK commandifaworkerisavailable. object containsinformationabouttheavailableworker,andmayhavethefollowingkeys: id string Theuniqueidentierforthisworker.Usethisvaluetorefertotheworker inthefuture. name string Ahuman-readablenamefortheworker.Useonlyfordisplaypurposes. Theserverdoesnottrackthespeedofeachworker.Ifyouwishtoadaptthe sizeofyourworkunitsbasedonthespeedoftheworkersuggested,youwill needtotrackthisinformationyourself. WORKACCEPTED id Theworkunitwithidentier id hasbeenforwardedtothespeciedworker. WORKFAILED object WORKFINISHED object WORKREJECTED object The WORKFAILED responseindicatesthatthenamedworkunitfailedtocomplete successfully.Theworkunitmayhaveexpiredorbeencanceled,ortheworkunit mayhaveencounteredanerrorduringitscomputation. WORKFINISHED indicatesthattheworkunitcompletedsuccessfully.The files keycontainsinformationabouttheoutputdata. WORKREJECTED denotesthattheworkunitwasnotforwardedtotheworker. Forexample,theworkunitobjectmaybemissingrequiredinformation,orthe 71 PAGE 82 requestedworkermightbeunavailable.Inparticular,iftheworkunitobjectis missingthe id key,itmaynotbepresentinthisresponse.Human-readable detailsareincludedinthe error keyofthe object The object maycontainthefollowingkeys: id string Theuniqueidentierfortheworkunit. error string Ahuman-readabledescriptionofanerror.Notexpectedina WORKFINISHED response. files array Alistofoutputles.Eachelementofthearrayisanotherarrayofthree elements.TherstelementistheMD5checksumofthele,thesecondis thenameofthele,andthethirdisthelocationoftheuploadedversion oftheleeitheratokenreturnedbytheuploadserviceora data: URL. Shouldbeexpectedonlyina WORKFINISHED response. Inthecaseof WORKFAILED or WORKREJECTED ,theworkcontainedinthefailed workunitwasnotcompletedandtheserverwillnotretrytheworkunit.The distributorisresponsibleforresubmittingtheworkaspartofasubsequent workunit. A.2.2TheApp Theappissenttoeachworkercomputerandcontainsthesoftwarecomponents necessarytoperformthecomputations.Theappisdistributedasagzip-compressed tarlewithalenameendingin .app .Thearchiveisorganizedwithonedirectoryfor 72 PAGE 83 eachsupportedoperatingsystemandhardwarearchitecturecombinationcurrently win32 win64 linux-x86 ,and linux-x86 64 ,andanadditionaldirectorynamed all ,whichcontainsoperatingsystem-andarchitecture-neutralles.Thelesin theappropriateplatformdirectoryandtheplatform-neutraldirectoryareextracted bythedistributedcomputingclientintotheworkingdirectory.Forexample,the ga-spectroscopy appcontainsthelesinFigureA.4. all/app.plPerlscript win32/ga-spectroscopy-client.exe-bitWindowsexecutable win32/spcat.exe-bitWindowsexecutable win64/ga-spectroscopy-client.exe-bitWindowsexecutable win64/spcat.exe-bitWindowsexecutable linux-x86/ga-spectroscopy-client-bitLinuxexecutable linux-x86/spcat-bitLinuxexecutable linux-x86_64/ga-spectroscopy-client-bitLinuxexecutable linux-x86_64/spcat-bitLinuxexecutable FigureA.4:Contentsof ga-spectroscopy.app Theclientthenloadsthe app.pl le.ThisisaPerlscriptwhichcontainsproblemspeciccode,andshouldfollowtheexampleinFigureA.5.Thisexampleappreadsa numberfromaninputle,slowlysquaresit,andthenreturnstheresultinanoutput le.Mostappswillprobablystartanexternalprogramtoperformthecomputation. Inthiscase,the ga-spectroscopy app gaspecclient.pl inthesourcedistribution, seeAppendixBprovidesagoodexampletoworkfrom.Theappshouldremove inputlesitdoesnotexpecttoreuseforfutureworkunits. Filenamesforinputlesmayhaveatmostonesubdirectorywhichmayonlycontainalphanumericcharactersandthelenamewhichmayonlycontainalphanumeric characters,period,hyphen,andunderscore.Forexample, temp/hello.txt and output-62442.log arebothpermitted,but ../outside.dat data/revised/suprane.cat ,and totallyawesome/filenamewithspaces arenotallowed. 73 PAGE 84 packageDistClient::SlowSquare;#ChangeSlowSquaretosomethingunique usethreads::shared; usewarnings; usestrict; subWork{ my$id,$obj=@_;#$idisworkunitID,$objisworkunitobject #Readthesecondinputfilethefirstinputfileistheapp my$infile=$obj->{files}[1][2]; openF,'<',$infileordie"Failedtoread$infile"; #Readfirstlinefrominfile,converttonumber my$number= PAGE 85 AppendixB SourceCode Thisappendixcontainsminimalsourcecodeforthegeneticalgorithmanddistributedcomputingplatform.Toconservespace,lesthatarepresentinthefull versionofthesourcecodebutnotrequiredtooperatethegeneticalgorithmordistributedcomputingplatformarenotincludedinthisprintededition.Omittedles includeconguration,template,anddatales;helperandresultsanalysisprograms; andthegraphicalandtextualinterfacestothedistributedcomputingclient. Forconvenience,eachlineisnumberedandchecksummed.Thelinenumber, checksum,andimmediatelyfollowingspacecharacteroneachlinearenotpartofthe le.Spacesareused,ratherthantabs,inalllesexcept Makefile ,whereatabis usedforindentation.Therearenotrailingwhitespacecharactersonanyline.The checksumformatistherstfourcharactersofthehexadecimalrepresentationofthe SHA1checksumforeachline.Theentireline,excludinganyend-of-linecharacters, isusedwhencomputingthechecksum.UTF-8characterencodingisassumed. TextsampleforOCR!"#$%&'*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO PQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~ TextsampleforOCR!"#$%&'*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO PQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~ 75 PAGE 86 File1: ga-spectroscopy.c 14524/**filega-spectroscopy.c 25a96* 3c0bb*Solvethespectroscopyproblem. 4c4d2*/ 5da39 6d422#include PAGE 87 151c880intrc; 1522a8dintk; 153a8c1fork=0;k<2;k++{ 154d7bcsnprintffilename,fnsize,"%s.%s",basename, 155c60ak==0?input_suffixes[j]:output_suffixes[j]; 156044brc=openfilename, 1574a45O_CREAT|O_EXCL|O_NOFOLLOW|O_WRONLY,S_IRUSR|S_IWUSR; 1589e9fifrc<0{ 1599355iferrno!=EEXIST{ 160fc04printf"Erroropeningtemporaryfile%s:%sn", 1611325filename,strerrorerrno; 1621802printf"Doesdirectory%sexist?--tempdirtochangen",dir; 1638c3freturnNULL; 1642b9f} 165c971break; 1662a9a} 1672888touched++; 168263ccloserc; 1696b4e} 1703923ifrc<0break; 171e266} 17248d0} 173be07freefilename; 1745ac6returnbasename; 175c2b7} 176236b#endif 177da39 17807f1intload_spec_templatesspecopts_t*opts{ 1797ffdinti; 1800cbbchar*filename; 1810055iffilename=mallocstrlenopts->template_fn+5==NULL{ 182b5fareturn50; 18348d0} 184aa09memsetopts->template,0,sizeofopts->template; 185c371fori=0;i<2;i++{ 186ab1cFILE*fh; 1879ccasprintffilename,"%s.%s",opts->template_fn,input_suffixes[i]; 1882ead/*Loadtemplate*/ 189c491iffh=fopenfilename,"r"==NULL{ 190dc59printf"Failedtoopentemplatefile:%sn",strerrorerrno; 1914cd1freefilename; 19207ffreturni+10; 193e266} 1948b86freadopts->template[i],sizeofopts->template[i],1,fh; 195bafaifferrorfh{ 196556aprintf"Failedtoreadtemplatefile:%sn",strerrorerrno; 1974cd1freefilename; 198483ereturni+20; 199e266} 200c09bfclosefh; 20148d0} 202be07freefilename; 20334e8return0; 204c2b7} 205da39 2063235/**ParseQuantumNumbersfromSPCAT.CATfilesSeespinv.pdf,note 2070c68*thatdocumentationincorrectlyclaimsvalidrangeis-259..359 2085a96* 209e2a4*returnsQuantumnumberinrange-269..359,9999todenote 2103cb4*overflow"**",-9999todenoteparseerror. 211c4d2*/ 212027aintparseqnconstchar*str{ 2138c7fintval; 2149707//printf"'%s'n",str; 2152834ifstr[0]=='*'||str[1]=='*'return9999;//>359or<-269 216339dif!isdigitstr[1]return-9999; 21774bbval=str[1]-'0'; 218b3b3ifisspacestr[0];//Donothing 21991e1elseifisdigitstr[0]val+=10*str[0]-'0'; 2208119elseifisalphastr[0]{ 2212cf2ifislowerstr[0]val=-val-10*str[0]-'a'+1; 222ae27elseval=val+10*str[0]-'A'+100; 22348d0} 2247c42elseifstr[0]=='-'val=-val; 2259133elsereturn-9999; 22651ddreturnval; 227c2b7} 228da39 22993adintload_catfileFILE*fh,datarow**storage,int*size,int*count{ 230f779/* 231df6ffloata,b,maxb=0; 23283eechartrailing[1024]; 233e5c3*/ 234a5d3floatmaxlgint=0; 235dff5inti=0,j=0; 236461ewhile1{ 2376f99floatfreq,err,lgint; 238dbb4intqntype=0; 23928f4struct{ 240c0b0/*Thisstructstoresthesplitfieldsofthecatfile.*/ 241dc0bcharfreq[14],err[9],lgint[9],dr[3],elo[11],gup[4],tag[8], 2422e46qnfmt[5],qn[128]; 243af6e}s; 244dc1e/*Usingastructmakesitveryeasytozeroout.*/ 2458973memset&s,0,sizeofs; 246da39 24772deif!fhbreak;/*Emptymemfilewasn'topened*/ 248da39 2494373/*Killnewlinescharacters*/ 2507210fscanffh,"%*[rn]"; 251da39 252f900/*Formatoflinesof.CATfileisdescribedinspinv.pdf*/ 2537977/*QNisonly24characters,butalsoreadanyextratrailingdata.*/ 2542193i=fscanffh,"%13c%8c%8c%2c%10c%3c%7c%4c%127[^rn]",s.freq,s.err, 25535c4s.lgint,s.dr,s.elo,s.gup,s.tag,s.qnfmt,s.qn; 256fab6ifi==EOF&&ferrorfh{ 2578c42printf"FailedtoreadCATfile:%sn",strerrorerrno; 25846c5return3; 259e266} 2603558elseifi==EOFbreak; 26117cdelseifi!=9{ 262fea8printf"Fileformaterror.fscanf:%sn",strerrorerrno; 2639d9ereturn4; 264e266} 265da39 2664598/*Convertstringvalues*/ 267d1ccerrno=0; 268a705iferrno==0freq=strtofs.freq,NULL; 269a8c5iferrno==0err=strtofs.err,NULL; 2701938iferrno==0lgint=strtofs.lgint,NULL; 271af19iferrno!=0{ 272895eprintf"Fileformaterror.strtof:%sn",strerrorerrno; 2739d9ereturn4; 274e266} 275da39 2767e23/*Allocateadditionalmemory,ifneccessary*/ 2777d03if*count>=*size{ 2788ce6*size=*size+1*2; 2794300//printf"Increasingmemoryallocationto%un",thrs->compdatasize; 2804bc8*storage=realloc*storage,sizeofdatarow**size; 2819b7bif*storage==NULL{ 282dc4fprintf"Outofmemory:%sn",strerrorerrno; 283f242return5; 2846b4e} 285e266} 286da39 28709c5/*Storefrequency,intensityvalues*/ 2882b89if*count==0||lgint>maxlgintmaxlgint=lgint; 28984af*storage[*count].frequency=freq; 29050b3*storage[*count].error=err; 291c44f/*Exponentiationoccursduringscaling,below*/ 292146f*storage[*count].intensity=lgint; 293da39 2942bb6/*Doubleresonancefromtrailingdata*/ 295e6b2/*Ensurequantumnumbersareoftype"303"or"1404"*/ 296ac7cifstrcmps.qnfmt,"1404"==0qntype=4; 2973701elseifstrcmps.qnfmt,"303"==0qntype=3; 2983208ifqntype<1||qntype>QN_DIGITS{ 29972c7printf"CATfilehasunsupportedQNFMT'%s'.n",s.qnfmt; 3009d9ereturn4; 301e266} 302c219fori=0;i<2;i++{ 303bf0aforj=0;j PAGE 88 3056e4cstr[0]=s.qn[12*i+j*2]; 306d35astr[1]=s.qn[12*i+j*2+1]; 3072e59str[2]=0; 3084c56val=parseqnstr; 309d0d9//printf"'%s'n",trailing; 310289bifval==-9999return6; 3116677//printf"%d",val; 3123bd3*storage[*count].qn[i*QN_DIGITS+j]=val; 3136b4e} 314ba29/*Fillremainingplaceswith0.*/ 315f31cfor;j PAGE 89 4592383elsefprintffh,"0"; 460d384} 46153b3else{ 4625d5eprintf"template:Unknownescapetype%cn",type; 4638421returnretval; 464d384} 4652b9f} 466c0b5parsemode=0; 4672a9a} 46871adoldj=j+1; 469a670ifc==0{/*Endoftemplate.Nextsubfileisfirstone.*/ 4704073ifi==1subfile=0; 471c971break; 4722a9a} 473a055ifc=='$'{ 474abdd/*Endofsubfile.Ifnotgotcorrectsubfile,proceedonwards. 47528d0Otherwise,return,indicatinganothersubfileavailable.*/ 47661b2ifnrecord PAGE 90 613da59specopts_t*settings->ref->tempdir=optarg; 61422cdbreak; 61595d2case45:/*compress*/ 616ed08{ 617c987inti; 6188af8specopts_t*o=specopts_t*settings->ref; 619a79fo->compress=0; 620d1fdfori=1;compressors[i][0];i++{ 62198a0if!optarg||strcmpcompressors[i][COMP_APP],optarg==0|| 6224630strcmpcompressors[i][COMP_EXT]+1,optarg==0{ 623fd7do->compress=i; 6246075ifoptargbreak; 6252a9a} 6266b4e} 6275bb8ifo->compress==0{ 6285d78ifoptarg{ 629e76aifstrcmpoptarg,"help" 6309381printf"Unknowncompressionmethod%sn",optarg?optarg:""; 6314abdprintf"Knowncompressionmethods:"; 6329f48fori=1;compressors[i][0];i++{ 633deadprintf"%s",compressors[i][COMP_APP]; 6342b9f} 63513d7printf"n"; 6362a9a} 637738fexit; 6386b4e} 639e266} 64022cdbreak; 641062fcase46:/*errordecay*/ 642164cspecopts_t*settings->ref->errordecay=atofoptarg; 64322cdbreak; 6447490case47:/*binscale*/ 645c428specopts_t*settings->ref->binscale=atofoptarg; 64622cdbreak; 6475571case58:/*random-bins*/ 6483308specopts_t*settings->ref->randbins=optarg?atoioptarg:10; 64922cdbreak; 650ea16case59:/*cooperative-mode*/ 651bc29specopts_t*settings->ref->componentcount=atoioptarg; 65222cdbreak; 6531685default: 6548fd2printf"Aborting:%cn",c; 655ea2dabort; 65648d0} 65734e8return0; 658c2b7} 659da39 660a753intmainintargc,char*argv[]{ 6614c7eGA_settingssettings; 6622c69specopts_tspecopts; 6631ecatime_tstarttime=timeNULL;/*Tocomputeruntime*/ 664107cintrc=0; 6657ffdinti; 666d755/*Afterupdatingusagecomments,rungenerate-usage.plonthisfile 6673a9e*toupdateusagemessageheaderfiles. 6688f07*/ 669e777staticconststructoptionmy_long_options[]={ 670e011/**-o,--outputFILE 6710479* 672e0a3*UseFILE.{cat,int,out,var}fortemporaryandoutputstorage. 6733c5d*defaultisrandomnameoftheform"tmp-gaspec-XXX" 674562c*/ 67523f5{"output",required_argument,0,'o'}, 67654c8/**-t,--templateFILE 6770479* 6784f56*UseFILE.{int,var}asatemplatetogenerateSPCATinput. 679c776*default"template"*/ 6806901{"template",required_argument,0,'t'}, 681d211/**-m,--matchFILE 6820479* 683cbde*MatchagainstFILE,formattedasanSPCAT.catfile.default 684e9d5*"isopropanol.cat" 685562c*/ 686cb79{"match",required_argument,0,'m'}, 687bbd5/**-S,--spcatFILE 6880479* 689f4fb*SPCATprogramfile.default"./spcat"*/ 690bc79{"spcat",required_argument,0,'S'}, 691751f/**-b,--binsNUMBER 6920479* 6934241*Numberofbinsformatching.defaultattimeofwritingwas 6947791*600*/ 695a44a{"bins",required_argument,0,'b'}, 696487d/**-w,--weightNUMBER 6970479* 698e124*Weightofintensityvspeakcountwhenevaluatingeachbin.*/ 699fc29{"weight",required_argument,0,'w'}, 700255a/**-P,--popfileFILE 7010479* 70253bd*Filetoloadinitialpopulationfrom.Linesshouldbeformatted 703422c*as"%d%dGD%u%u%u"sameaspopulationoutputfile. 704562c*/ 7055c7c{"popfile",required_argument,0,'P'}, 706f3a0/**--aminNUMBER 7070479* 7084c57*MinimumAvalueinunitscompatiblewithtemplatefile. 709cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 710feea*fitsee--components. 711562c*/ 712ea3c{"amin",required_argument,0,20}, 71352b7/**--amaxNUMBER 7140479* 71538dd*MaximumAvalueinunitscompatiblewithtemplatefile. 716cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 717feea*fitsee--components. 718562c*/ 719f9ea{"amax",required_argument,0,21}, 7208421/**--bminNUMBER 7210479* 722a73f*MinimumBvalueinunitscompatiblewithtemplatefile. 723cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 724feea*fitsee--components. 725562c*/ 726af2b{"bmin",required_argument,0,22}, 727fb31/**--bmaxNUMBER 7280479* 7296a4a*MaximumBvalueinunitscompatiblewithtemplatefile. 730cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 731feea*fitsee--components. 732562c*/ 733a958{"bmax",required_argument,0,23}, 734846e/**--cminNUMBER 7350479* 736b8c3*MinimumCvalueinunitscompatiblewithtemplatefile. 737cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 738feea*fitsee--components. 739562c*/ 7404d6b{"cmin",required_argument,0,24}, 741c353/**--cmaxNUMBER 7420479* 743f918*MaximumCvalueinunitscompatiblewithtemplatefile. 744cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 745feea*fitsee--components. 746562c*/ 747fd9a{"cmax",required_argument,0,25}, 7482f89/**--djminNUMBER 7490479* 7509501*MinimumDJvalueinunitscompatiblewithtemplatefile. 751cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 752feea*fitsee--components. 753562c*/ 754092f{"djmin",required_argument,0,26}, 7554f11/**--djmaxNUMBER 7560479* 757122b*MaximumDJvalueinunitscompatiblewithtemplatefile. 758cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 759feea*fitsee--components. 760562c*/ 7616bf0{"djmax",required_argument,0,27}, 76263e5/**--djkminNUMBER 7630479* 764d0eb*MinimumDJKvalueinunitscompatiblewithtemplatefile. 765cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 766feea*fitsee--components. 80 PAGE 91 767562c*/ 7682760{"djkmin",required_argument,0,28}, 769887f/**--djkmaxNUMBER 7700479* 7717f94*MaximumDJKvalueinunitscompatiblewithtemplatefile. 772cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 773feea*fitsee--components. 774562c*/ 7755765{"djkmax",required_argument,0,29}, 776324f/**--dkminNUMBER 7770479* 7785668*MinimumDKvalueinunitscompatiblewithtemplatefile. 779cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 780feea*fitsee--components. 781562c*/ 782ff92{"dkmin",required_argument,0,30}, 7832700/**--dkmaxNUMBER 7840479* 785ee34*MaximumDKvalueinunitscompatiblewithtemplatefile. 786cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 787feea*fitsee--components. 788562c*/ 7890e53{"dkmax",required_argument,0,31}, 790dd0d/**--deljminNUMBER 7910479* 792ddfc*MinimumdelJvalueinunitscompatiblewithtemplatefile. 793cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 794feea*fitsee--components. 795562c*/ 796eb4b{"deljmin",required_argument,0,32}, 7972b49/**--deljmaxNUMBER 7980479* 799e373*MaximumdelJvalueinunitscompatiblewithtemplatefile. 800cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 801feea*fitsee--components. 802562c*/ 803deff{"deljmax",required_argument,0,33}, 804ad96/**--delkminNUMBER 8050479* 80619ea*MinimumdelKvalueinunitscompatiblewithtemplatefile. 807cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 808feea*fitsee--components. 809562c*/ 810733b{"delkmin",required_argument,0,34}, 81107f3/**--delkmaxNUMBER 8120479* 813f3f5*MaximumdelKvalueinunitscompatiblewithtemplatefile. 814cdc0*Maybespecifiedmultipletimes;onceforeachcomponentbeing 815feea*fitsee--components. 816562c*/ 8175640{"delkmax",required_argument,0,35}, 8189815/**--rangeminNUMBER 8190479* 82082a0*Minimumobservationrange. 821562c*/ 8226a66{"rangemin",required_argument,0,38}, 8239e7b/**--rangemaxNUMBER 8240479* 82578ff*Maximumobservationrange. 826562c*/ 827a8dd{"rangemax",required_argument,0,39}, 828445d/**--drfileFILE 8290479* 8304e46*Filecontainingdoubleresonancedata. 831562c*/ 83298fe{"drfile",required_argument,0,40}, 8334bc8/**--drtolTOLERANCE 8340479* 835f1e1*Matchingtolerance. 836562c*/ 83726c4{"drtol",required_argument,0,41}, 838382d/**--linkbc 8390479* 840045e*RunalgorithmusingB+CandB-CinsteadofBandC. 841562c*/ 842299d{"linkbc",no_argument,0,42}, 8438435/**--distributedDISTRIBUTOR 8440479* 845eff4*UseDISTRIBUTORfordistributedcomputation. 846562c*/ 8474149{"distributed",required_argument,0,43}, 84859cb/**--tempdirDIR 8490479* 850eb10*UseDIRfortemporaryfiles. 851562c*/ 852db46{"tempdir",required_argument,0,44}, 8531f8c/**--compressMETHOD 8540479* 8559e39*UseMETHODforcompressingfiles. 8567a32*ForalistofknownMETHODs,try--compresshelp. 857562c*/ 858f819{"compress",required_argument,0,45}, 8596ad1/**--errordecayFLOAT 8600479* 8611b60*DecaytheerrorestimateforSPCATatrateFLOAT 862562c*/ 863a7e5{"errordecay",required_argument,0,46}, 8648343/**--binscaleNUMBER 8650479* 8666d7b*Addthisnumberfloatofbinseachgeneration 867562c*/ 868198c{"binscale",required_argument,0,47}, 86926a3/**--random-bins 8700479* 87106a0*Usearandomnumberofbinseachgeneration 872562c*/ 873c115{"random-bins",required_argument,0,58},/*FIXME:Optionaldefault*/ 874797b/**--componentsNUMBER 8750479* 87677c1*Numberofcomponentstofit 877562c*/ 8780d25{"components",required_argument,0,59}, 879ab02{0,0,0,0} 8805f2f}; 8819bc3#ifdefCLIENT_ONLY 8820338FILE*config; 88313e3unsignedintgeneration=0; 88422c7#else 8851b00char*my_optstring="o:t:m:b:S:w:P:"; 8863033char*optlog=malloc;/*Initialallocation*/ 887da39 888d5eeif!optlog{printf"Outofmemoryoptlogn";exit;} 88964edoptlog[0]=0; 890da39 891d4b2GA_defaultsettings&settings; 892aa60settings.debugmode=0;/*Usenon-verboseoutput*/ 893b79csettings.popsize=64; 894e543settings.generations=200; 895236b#endif 896da39 897ae6esettings.ref=&specopts; 898505aspecopts.basename_out=NULL; 8995081specopts.template_fn="template-404"; 900d029specopts.obsfile="isopropanol-404.cat"; 901f254#ifdef_WIN32 902d5b2specopts.spcatbin="./spcat.exe"; 90322c7#else 904ed20specopts.spcatbin="./spcat"; 905236b#endif 9060755specopts.tempdir="temp"; 9072e0cspecopts.bins=BINS; 908d430specopts.distanceweight=1.0; 909f1ad/*Theseareresizedduringoptionparsing*/ 910f953specopts.userange=specopts.rangetemp=specopts.rangemin= 91113d6specopts.rangemax=specopts.initialerror=NULL; 9129a76specopts.rangesize=0; 913acfcrealloc_specopts_range&specopts,1;/*Initializetoonecomponent*/ 9142868specopts.errordecay=0; 915a645specopts.binscale=0; 916de9cspecopts.randbins=0; 917f1e7specopts.popfile=NULL; 9181c9bspecopts.drfile=NULL; 9190f13specopts.doubleres=NULL; 920a50cspecopts.doublereslen=0; 81 PAGE 92 92118bespecopts.doublerestol=2; 9227c37specopts.linkbc=0; 92397c0specopts.obsrangemin=RANGEMIN; 9249251specopts.obsrangemax=RANGEMAX; 9253f9fspecopts.distributor=NULL; 9264beaspecopts.compress=0; 9275b17specopts.componentcount=1; 928da39 9299bc3#ifdefCLIENT_ONLY 93062cfifargc!=3{ 931c16bprintf"Usage:%s PAGE 93 1075227cspecopts.observationcount=0; 1076da39 10779bc3#ifdefCLIENT_ONLY 1078f0f5specopts.basename_out=strdupargv[2]; 107922c7#else 1080146b/*Chooseanoutputfile*/ 10819777ifspecopts.basename_out==NULLspecopts.basename_out="."; 10821171{ 10831bc0DIR*x=opendirspecopts.basename_out; 108436dfifx!=NULL{/*Givenadirectory*/ 1085ba99char*dir=specopts.basename_out; 10867df4unsignedlen=strlendir+32; 1087b662specopts.basename_out=malloclen; 1088267dif!specopts.basename_out{ 1089b7c8printf"basenamemallocfailedn"; 1090738fexit; 10916b4e} 10927964ifsnprintfspecopts.basename_out,len,"%s/tmp-gaspec-%d", 1093221edir,getpid>=len{ 109454c7printf"basenameoverflown"; 1095738fexit; 10966b4e} 1097e266} 1098097celse{ 1099afa0/*Requiresdynamicallyallocatedstring*/ 11001e33specopts.basename_out=strdupspecopts.basename_out; 1101267dif!specopts.basename_out{ 11028237printf"Outofmemorybasenamedupn"; 1103738fexit; 11046b4e} 1105e266} 110648d0} 1107da39 1108de5c/*Debugmode*/ 11098db5if!settings.debugmode{ 1110ab1cFILE*fh; 11113419char*filename; 11123871constchar*compressext=compressors[specopts.compress][COMP_EXT]; 111324b3iffilename=mallocstrlenspecopts.basename_out+15==NULL{ 1114b1cbprintf"Outofmemorylogfilen"; 1115b289exit1; 1116e266} 1117206fsprintffilename,"%s.log%s",specopts.basename_out,compressext; 111867e0/*Openthelogfile*/ 1119334diffh=fopenfilename,"w"==NULL{ 11204bfeprintf"Failedtoopenlogfile:%sn",strerrorerrno; 11214cd1freefilename; 1122b289exit1; 1123e266} 112465ebfreefilename; 1125b25eifspecopts.compress{ 1126481f/*Spawncompressionprocess*/ 11275d34intpipes[2]; 11287988intcompresspid; 1129ca9fifpipepipes!=0{ 11302147printf"Failedtocreatepipeforcompression:%sn",strerrorerrno; 1131738fexit; 11326b4e} 11334444compresspid=fork; 11341147ifcompresspid==-1{ 1135b173printf"Failedforforkforcompression:%sn",strerrorerrno; 1136738fexit; 11376b4e} 1138ab50elseifcompresspid==0{ 1139e8c5char*argv[2]={compressors[specopts.compress][COMP_APP],NULL}; 1140b24c/*Childprocess*/ 11412779closepipes[1]; 114240b8/*MakethefilebeourSTDOUT,andthepipebeourSTDIN*/ 11432e14ifdup2filenofh,STDOUT_FILENO==-1{ 114430a2perror"Failedtosetstdinforcompressor"; 114556b9exit; 11462a9a} 11474a18ifdup2pipes[0],STDIN_FILENO==-1{ 114830a2perror"Failedtosetstdinforcompressor"; 114956b9exit; 11502a9a} 11512507/*XZabortsonCtrl-C,doesn'twritedata.IfparentgetsCtrl-C, 11522789*we'llgetEOFsoonenough.*/ 11533249signalSIGINT,SIG_IGN; 1154f488/*Executexz*/ 115548e2execvpargv[0],argv; 1156d525fprintfstderr,"Failedtoexecutecompressor%s:%sn", 1157722cargv[0],strerrorerrno; 1158738fexit; 11596b4e} 11600769/*ClosethelogfileanddupthepipetoSTDOUTinstead.*/ 116109c5fclosefh; 11624ba3closepipes[0]; 1163441biffh=fdopenpipes[1],"w"==NULL{ 11649ef3printf"Failedtoopenpipetocompressor:%sn",strerrorerrno; 11656b4e} 1166e266} 11673d35settings.logfh=fh; 1168839fqprintf&settings,"Detailssavedin%s.log%sn", 11695284specopts.basename_out,compressext; 117048d0} 1171236b#endif 1172797a#ifndef_WIN32 1173d29b/*Open/dev/nulltoallowhidingofSPCAToutput.Non-portable.*/ 11741171{ 117536afFILE*devnull; 1176daf4ifdevnull=fopen"/dev/null","w"==NULL{ 11776e97printf"Can'topen/dev/null:%sn",strerrorerrno; 1178b289exit; 1179e266} 1180b9e7ifspecopts.stdoutfd=dupfilenostdout==-1{ 11815eb6printf"Can'tdupstdout:%sn",strerrorerrno; 1182b289exit; 1183e266} 118450d6ifspecopts.devnullfd=filenodevnull==-1{ 1185512aprintf"Can'tgetfilenoof/dev/null:%sn",strerrorerrno; 1186b289exit; 1187e266} 118848d0} 1189236b#endif 1190180f#ifndefCLIENT_ONLY 1191c396lprintf&settings,"%sn",optlog+1;freeoptlog; 1192236b#endif 1193da39 1194a43fqprintf&settings,"Usingoutputfile%sn",specopts.basename_out; 1195da39 11968537/*LoadSPCAT.int&.varinputtemplatefiles*/ 11979b68lprintf&settings,"Loadingtemplatefiles%s.{var,int}n",specopts.template_fn; 11985b68ifrc=load_spec_templates&specopts!=0{ 11992ecaqprintf&settings,"load_spec_templatesfailed:%dn",rc; 1200d491returnrc; 120148d0} 12024a6c/*Loadobserveddatafile*/ 1203e60elprintf&settings,"Loadingobservationfile%sn",specopts.obsfile; 12045910ifrc=load_spec_observation&specopts!=0{ 12050214qprintf&settings,"load_spec_observationfailed:%dn",rc; 1206d491returnrc; 120748d0} 1208180f#ifndefCLIENT_ONLY 12098ea4/*Loadstartingpopulation*/ 121058d6ifspecopts.popfile{ 12110152lprintf&settings,"Loadinginitalpopulation%sn",specopts.popfile; 1212e263/*TODOcleanup*/ 1213583dspecopts.popdata=mallocsizeofGA_segment*specopts.componentcount* 1214e586SEGMENTS*settings.popsize; 1215436dif!specopts.popdata{ 121601dbqprintf&settings,"loadpopdatafailedn"; 1217a5bcreturn1; 1218e266} 1219dda2memsetspecopts.popdata,0,sizeofspecopts.popdata; 12209765FILE*fh=fopenspecopts.popfile,"r"; 12219377unsignedintidx=0; 12223ff6whileidx PAGE 94 12292a9a} 1230e622}elserc=0;/*Pretendtobeanerrorfromtheinnerloop*/ 12310080ifrc==0{ 123227eeqprintf&settings,"loadpopdatafailed:EOFatidx=%un",idx; 123361dbreturn1; 12346b4e} 1235dd55idx++; 1236e266} 1237c09bfclosefh; 12383d18/*DisplaytheloadedpopulationIdon'tfeellikemigratingthisto 12397f9d*SEGMENTcount-independentformFIXME*/ 1240154b/* 12412af7foridx=0;idx PAGE 95 138375d1ifoutfile=fopenspecopts.basename_out,"w"==NULL{ 1384766cprintf"Couldnotopenoutputfile%s:%sn", 13856ddcspecopts.basename_out,strerrorerrno; 1386b289exit1; 1387e266} 138867ed/*Runthepre-generationhandler*/ 13897053ifrc=GA_starting_generation&ga!=0{ 139072ecprintf"Couldnotstartgeneration,rc=%dn",rc; 1391b289exit1; 1392e266} 13935cf5/*Startreadloop*/ 139436bfforpos=0;pos PAGE 96 1537ca9eopts->rangemin[i],x[i],opts->rangemax[i];*/ 15385461return0; 1539e266} 154048d0} 1541522breturn1; 1542c2b7} 1543da39 1544c968intGA_terminationconstGA_session*ga{ 15451e75ifga->population[ga->fittest].unscaledfitness>-0.00001 154679b3return1; 154734e8return0; 1548c2b7} 1549929b#endif/*notCLIENT_ONLY*/ 1550da39 15511a22intGA_starting_generationGA_session*ga{ 1552cba4specopts_t*opts=specopts_t*ga->settings->ref; 15535fb1/*Updatebinsize*/ 15540e1cintoldbins=opts->scaledbins; 1555ca9aopts->scaledbins=opts->bins; 1556180f#ifndefCLIENT_ONLY 15575e39ifopts->randbins!=0{/*Randombinning*/ 155861d0opts->scaledbins=opts->randbins+GA_randga%opts->bins; 155948d0} 1560236b#endif 1561c264ifopts->binscale!=0/*Scaledbins*/ 1562138eopts->scaledbins=opts->bins+ga->generation*opts->binscale; 15630dd2ifoldbins!=opts->scaledbins{ 15646831qprintfga->settings,"Nowusing%dbinsn",opts->scaledbins; 1565180f#ifndefCLIENT_ONLY 1566e83bifga->settings->distributor 1567d865fprintfga->settings->distributor,"CFGSbins%unV%sn", 15685ba6opts->scaledbins,CHECKSUM; 1569236b#endif 157048d0} 157134e8return0; 1572c2b7} 1573da39 15741db4typedefstruct{unsignedintindex;doubletotal;}sortable_bin; 1575da39 15762b98intbin_comparatorconstvoid*a,constvoid*b{ 1577a0efsortable_binx=*sortable_bin*a,y=*sortable_bin*b; 15789685ifx.total PAGE 97 16918e5fi=invisible_systemopts->devnullfd,2,opts->spcatbin,filename; 16921aadifi!=0{ 169352efqprintfga->settings,"Failedtostartspcat--spcattospecifypathn"; 16941756return11; 1695e266} 16962a7aif!WIFEXITEDi||WEXITSTATUSi!=0{ 1697f171qprintfga->settings,"spcatdidnotreturnsuccessn"; 16980edcreturn10; 1699e266} 1700236b#endif 1701da39 17029b79/*ReadSPCAToutputfile*/ 17037cb3snprintffilename,sizeoffilename,"%s.cat",thrs->basename_temp; 1704c491iffh=fopenfilename,"r"==NULL{ 1705b4a5qprintfga->settings,"Failedtoopencatfile:%sn",strerrorerrno; 17065e6ereturn15; 1707e266} 1708236b#endif 1709da39 1710d8c8i=load_catfilefh,&thrs->compdata,&thrs->compdatasize, 171168f8&thrs->compdatacount; 1712fe52ifi>0return20+i; 1713c09bfclosefh; 1714da39 1715c50b/*Isthereanothersubfileweneedtoprocess?*/ 17167308ifrc==0break; 171748d0} 1718da39 17193452/*Determinefitness*/ 1720d7ea//tprintf"COUNTS:%d%dn",opts->observationcount,thrs->compdatacount; 1721fd16fitness=0; 1722da39 172378e4/*Checkdoubleresonance*/ 1724da39 1725c487/*BUG--Needtocheckagainst*either*ofthedblresesoffirstitem 1726ce89*--theydon'tneedtotoallbethesame.*/ 17274cd9/*Isn'tthatfixed?*/ 1728da39 17291b12intdrfail=0; 17302546fori=0;i PAGE 98 1845bd09/*We'rewithinthevalidrange*/ 18462145intbin=floorentry->frequency-opts->obsrangemin/binsize; 1847adecifbin>=scaledbinsbin=scaledbins-1; 1848fdeaifj==0{ 1849271dobsbin[bin]+=entry->intensity; 18503442obsbincount[bin]++; 18519b07//ifi==0||entry.b>obsmaxobsmax=entry.b; 18526b4e} 18531e8celse{ 1854617e/*Skippeaksthataretooimprecisetobin.20110910*/ 18551bfb/*Unfortunately,weneedtoskipmanyfewerpeaks.*/ 1856e427ifentry->error>errtolcontinue; 18577b14//doubleweight=fabsentry.b/obsmax; 1858969d//ifweight<1weight=1; 1859e115/*Prediction-scaleme*/ 18603590compbin[bin]+=entry->intensity;//*weight; 186179d3compbincount[bin]++; 1862ae8b//printf"BW:sqrt/%dn",entry.qn[0]+entry.qn[3]; 1863352dbinweights[bin]+=sqrt.0/entry->qn[0]+entry->qn[3];/*20110215*/ 18642467binerror[bin]+=entry->error; 18656b4e} 1866e266} 186748d0} 1868a718#if0/*EXPERIMENTALBEHAVIOR2010-09-26*/ 1869fdc1doublebinweights[scaledbins]; 1870fb09sortable_binbinorder[scaledbins]; 1871959cfori=0;i PAGE 99 1999f22bsigaddset&sa.sa_mask,SIGCHLD; 200088bfsigprocmaskSIG_BLOCK,&sa.sa_mask,&saveblock; 20015d8d*/ 2002da39 2003ce09/*LOCKio*/ 2004c7c2#ifTHREADS&&!CLIENT_ONLY 20058309stat=pthread_mutex_lock&GA_iomutex; 2006efa5ifstat{printf"tprintf:mutex_lockio:%dn",stat;exit;} 2007236b#endif 2008ec3bflockfilestdout; 2009f376fflushNULL; 20102cb0//gettimeofday&starttime,NULL;/*Tocomputeruntime*/ 201115faifpid=fork==0{ 2012154b/* 2013d98bsigactionSIGINT,&savintr,structsigaction*; 20148b2bsigactionSIGQUIT,&savequit,structsigaction*; 20154618sigprocmaskSIG_SETMASK,&saveblock,sigset_t*; 20165163*/ 2017da39 2018f998/*Disablestdout*/ 20196431ifdup2stdoutfd,STDOUT_FILENO==-1_exit; 20201989funlockfilestdout; 2021da39 2022db44execvargv[0],argv; 20232291/*execl"/bin/sh","sh","-c",arg,char*;*/ 2024abfa_exit; 202548d0} 2026fbe2/*UNLOCKio*/ 20275be5funlockfilestdout; 2028c7c2#ifTHREADS&&!CLIENT_ONLY 20297770stat=pthread_mutex_unlock&GA_iomutex; 203082e4ifstat{printf"tprintf:mutex_unlockio:%dn",stat;exit;} 2031236b#endif 20324b68freeargv; 2033da39 20341e15ifpid==-1{ 20354689printf"invisible_system:forkfailed:%sn",strerrorerrno; 20364399stat=-1;/*errnocomesfromfork*/ 2037ea48}else{ 2038577dwhilewaitpidpid,&stat,0==-1{ 203920a7iferrno!=EINTR{ 20408d2dstat=-1; 20414968printf"invisible_system:waitpidfail:%sn",strerrorerrno; 20427acdbreak; 20436b4e} 2044e266} 204548d0} 2046f0f5//gettimeofday&endtime,NULL; 20478252//printf"Systemtook%fseconds.n", 2048d229//timeval_diffNULL,&endtime,&starttime/1000000.0; 2049c379/* 205002acsigactionSIGINT,&savintr,structsigaction*; 205106b3sigactionSIGQUIT,&savequit,structsigaction*; 20528958sigprocmaskSIG_SETMASK,&saveblock,sigset_t*0; 20535d8d*/ 2054da39 20552cfcreturnstat; 2056c2b7} 2057236b#endif File2: ga.c 19f7e/**filega.c 25a96* 3205c*Geneticalgorithmimplementationfile. 4c4d2*/ 5d422#include PAGE 100 8857a3session->settings=settings; 895409session->fittest=0; 90f4b8/*AllocatethepopulationarrayfreedinGA_cleanup*/ 91f2easession->population=mallocsizeofGA_individual*settings->popsize; 924242if!session->populationreturn1; 931fecsession->oldpop=mallocsizeofGA_individual*settings->popsize; 941c46if!session->oldpopreturn2; 952d5b/*AllocateandfillthesortedlistfreedinGA_cleanup*/ 9649c8session->sorted=mallocsizeofunsignedint*settings->popsize; 97a32fif!session->sortedreturn3; 98ff55/*Initializeeachindividual*/ 999ebdfori=0;i PAGE 101 242d294/*Insertnewsegments*/ 243679bsession->population[i].segments[j]= 24427e9session->oldpop[session->sorted[i]].segments[j]; 245bb0dsession->population[i].gdsegments[j]= 2461a4fsession->oldpop[session->sorted[i]].gdsegments[j]; 2476b4e} 248e266} 2497a2e/*Createanewpopulationbyroulettewheel. 2501d68Consider:Tophalfroulettewheel/Keeptop8?*/ 25107c3GA_generatesession,i; 252da39 25325f8/*Displaycachebuckets*/ 254154b/* 2559867printf"BKTS%03d",session->generation; 256d7d9fori=0;i PAGE 102 396c2b7} 397da39 3985357staticvoidGA_cache_fitnessGA_session*session,unsignedinti, 399e35bunsignedinthashbucket{ 40098abGA_segment*ptr; 401503cif!session->fitnesscachereturn; 402f46f#ifTHREADS 4037322/*LOCKfitnesscache*/ 404c661intj=pthread_mutex_lock&session->cachemutex; 4051695ifj{qprintfsession->settings, 406e4f6"GA_dcf:mutex_lockcache_w:%dn",j;exit;} 407236b#endif 40807e0ifsession->fitnesscache[hashbucket][0].unscaledfitness!=0{ 409c438/*Firstdemotethefirst-levelhashtosecond-level*/ 410227cptr=session->fitnesscache[hashbucket][1].segments; 41167bamemcpy&session->fitnesscache[hashbucket][1], 4123dae&session->fitnesscache[hashbucket][0],sizeofGA_individual; 41390f6session->fitnesscache[hashbucket][1].segments=ptr; 414519ememcpyptr,session->fitnesscache[hashbucket][0].segments, 41535a5sizeofGA_segment*session->population[i].segmentcount; 41648d0} 417c968ptr=session->fitnesscache[hashbucket][0].segments; 4188123memcpy&session->fitnesscache[hashbucket][0], 419a222&session->population[i],sizeofGA_individual; 42060eesession->fitnesscache[hashbucket][0].unscaledfitness=1; 421fd13session->fitnesscache[hashbucket][0].segments=ptr; 4225803memcpyptr,session->population[i].segments, 42399besizeofGA_segment*session->population[i].segmentcount; 424f46f#ifTHREADS 425867e/*UNLOCKfitnesscache*/ 426e1baj=pthread_mutex_unlock&session->cachemutex; 4271695ifj{qprintfsession->settings, 4286097"GA_dcf:mutex_unlockcache_w:%dn",j;exit;} 429236b#endif 430c2b7} 431da39 432fc67staticunsignedintGA_hash_individualGA_session*session,unsignedinti{ 43361d0GA_segmenthashtemp=0; 4342443intj; 4355f16if!session->fitnesscachereturn0; 436da39 437e9b3forj=0;j PAGE 103 5506177ifrc 551d87a{qprintfsession->settings, 552d685"GA_do_thread:mutex_unlockout:%dn",rc;exit;} 5537e22/*printf"Outputcompletedn";*/ 554c2b7} 555da39 556bfaestaticvoid*GA_do_threadvoid*arg{ 5576aa9GA_thread*thread=GA_thread*arg; 5580bfcGA_session*session=thread->session; 559461ewhile1{ 560d909intin,found,last; 5619118intrc; 562da39 563171d/*Findajobtoprocess*/ 564c0ba/*printf"Waitingformutex...n";*/ 565e238rc=pthread_mutex_lock&session->inmutex; 566318eifrc{qprintfsession->settings, 5674433"GA_do_thread:mutex_lockin:%dn",rc;exit;} 5682a3a/*Waituntilthejob-availableflagisset*/ 5693760while!session->inflag{ 5703f58/*printf"Waitingforcond...n";*/ 571c84drc=pthread_cond_wait&session->incond,&session->inmutex; 572b816ifrc{qprintfsession->settings, 573ba47"GA_do_thread:cond_waitin:%dn",rc;exit;} 574e266} 575b0cain=session->inindex;/*Wegotdatatoprocess!*/ 576da39 577d4c0/*Indistributedmode,we'llhandletheentirepopulationinthis 5783a5a*thread.*/ 579eab9ifsession->settings->distributorlast=session->settings->popsize; 5801ed4elselast=in+1; 5814f4asession->inindex=last; 582da39 583df14/*Isthereanotherjobtodispatch?*/ 584e1a5ifsession->inindex PAGE 104 7040d25ifcfinite>0{ 7050cf9j=session->settings->popsize; 706a0a1fori=0;i PAGE 105 8587bb2/*Didnotchoosesameitem.Dofullcomparison,returnerroriffails*/ 859af38/*FIXME-Completelybroken!!!Failsforga-numbers*/ 860635c/*Doesthisworknow?*/ 86164a2intmc=memcmp&session->population[session->sorted[0]].segments[0], 862d2c3&session->population[session->fittest].segments[0], 8639ac3sizeofGA_segment*session->population[session->fittest]. 864fe54segmentcount; 8653d26ifmc{ 86677f6qprintfsession->settings,"Sortfailed:S=%uvsF=%un", 8674acasession->sorted[0],session->fittest; 868ceedreturn2; 869e266} 87048d0} 871bcbf/*Displaythebestindividual*/ 872017ddisplay_individualsession,session->fittest,1,"BEST"; 873c8e9lprintfsession->settings,"n"; 87434e8return0; 875c2b7} 876da39 8778574GA_segmentgraydecodeGA_segmentgray{ 8782a9fGA_segmentbin; 8799752#ifGA_segment_size==32 8800c7e/*Optimization:b=b^g>>1;b=b^b>>2;...4;...8;...16; 8810449IFBIT,...32--canbemuchfaster.*/ 8822589bin=gray; 88301debin^=bin>>1; 8847a35bin^=bin>>2; 88539f6bin^=bin>>4; 886fd48bin^=bin>>8; 887e31ebin^=bin>>16; 88822c7#else 8891247forbin=0;gray;gray>>=1bin^=gray; 890236b#endif 891d7fareturnbin; 892c2b7} 893da39 894b58dGA_segmentgrayencodeGA_segmentbin{ 8955ca9returnbin^bin>>1; 896c2b7} 897da39 8981807unsignedinturandom{ 899fedfunsignedintout; 900e529FILE*fh; 901d97eif!fh=fopen"/dev/urandom","r"return0; 90219a4iffread&out,sizeofout,1,fh!=1return0; 9032966fclosefh; 904adf1returnout; 905c2b7} 906da39 9079757/**Appendsecondstringintofirst.Repeatedappendstothesame 908e2ca*stringarecached,makingthisanOnfunction,wheren= 9098816*strlenappend. 9105a96* 911413e*Frompublicdomainfilesrc/usr.bin/sdiff/sdiff.cin 9127e13*ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-current/tar_files/src/usr.bin.tar.gz 913c4d2*/ 914c251staticintastrcatchar**s,constchar*append{ 915bcc7/*Lengthofstringinpreviousrun.*/ 9167f80staticsize_toffset=0; 917a83fsize_tnewsiz; 91831b2/*Stringfrompreviousrun.Comparedto*stoseeifweare 919c2b7*dealingwiththesamestring.Ifso,wecanuseoffset.*/ 920da39 9217919/*nandhp:Thisdoesn'tworkifanewstringisallocatedatthesame 922803b*addressasthepreviousone.Thiscausesdisplayproblems.FIXME*/ 923207c/*static*/constchar*oldstr=NULL; 924a4afchar*newstr; 925da39 926c024/*FirststringisNULL,sojustcopyappend.*/ 9273e9eif!*s{ 92850a5if!*s=strdupappendreturn1; 929da39 9306ba2/*Keeptrackofstring.*/ 9310a56offset=strlen*s; 932219coldstr=*s; 933da39 934253areturn0; 93548d0} 936da39 93793b5/**sisastringsoconcatenate.*/ 938da39 9396765/*Didweprocessthesamestringinthelastrun?Ifthisisa 9407c8c*differentstringfromtheonewejustprocessedcachenew 9412f19*string.*/ 942a90difoldstr!=*s{ 9430a56offset=strlen*s; 944219coldstr=*s; 94548d0} 946da39 9470845/*Size=strlen*s+n+strlenappend+''.*/ 9487092newsiz=offset+1+strlenappend+1; 949da39 9501301/*Resize*stofitnewstring.*/ 951cc85newstr=realloc*s,newsiz; 9520fe3ifnewstr==NULLreturn2; 953071c*s=newstr; 954da39 95555dc/**s+offsetshouldbeendofstring.*/ 9561945/*Concatenate.*/ 957478estrncpy*s+offset,"n",newsiz-offset; 9584389strncat*s+offset,append,newsiz-offset; 959da39 9602ba8/*Newstringlengthshouldbeexactlynewsiz-1characters.*/ 9617d46/*Storegeneratedstring'svalues.*/ 962ab03offset=newsiz-1; 9635f4coldstr=*s; 96434e8return0; 965c2b7} 966da39 967e36f/**HelperfunctionforGA_getopt. 9685a96* 969e0da*paramargc,argvCommand-linearguments,frommain. 970173b*paramsettingsGA_settingsargumenttostoreconfigurationin. 9719b8a*paramoptstringgetoptoptstring. 972ab3a*paramlong_optionsgetoptoptionslist. 9736340*paramglobal_countThenumberofnon-problem-specificoptions. 974cf61*parammy_parse_optionOptionparsingfunctionforproblem-specificoptions. 975fe70*parammy_usageUsagemessageforproblem-specificoptions. 97638d6*paramoptlogReallocablestringtologoptionsinto,orNULL. 9779913*paramoptlogtypeOriginofoption"F"ileorCommand-line"A"rguments 9785a96* 9791165*seeGA_getopt 9805a96* 9810948*returns0forsuccess. 982c4d2*/ 9837cacintGA_run_getoptintargc,char*constargv[],GA_settings*settings, 98408d5constchar*optstring,conststructoption*long_options, 98578c1intglobal_count,GA_my_parseopt_tmy_parse_option, 9863443constchar*my_usage,char**optlog,charoptlogtype{ 987c501intc; 9883065while{ 989d79f/*getopt_longstorestheoptionindexhere.*/ 9900377intoption_index=0; 991da39 99272dc/*tprintf"%d%s%sn%d%xn%dn",argc,argv[1],argv[2],optind,argv;*/ 9935c65c=getopt_longargc,argv,optstring,long_options,&option_index; 994da39 995f2fd/*Recordtooptlog*/ 996cea9ifoptlog&&*optlog{ 997c798char*msg=NULL; 9986b73intrc=0; 99938e4ifc==0 1000164frc=asprintf&msg,"CFG%c%s%s",optlogtype, 1001f87flong_options[option_index].name,optarg?optarg:""; 10024d69else{ 1003f66ainti=0; 1004e7abfori=0;long_options[i].name;i++{ 1005a200iflong_options[i].flag==NULL&&long_options[i].val==c{ 10063745rc=asprintf&msg,"CFG%c%s%s",optlogtype, 1007cde5long_options[i].name,optarg?optarg:""; 10081378break; 1009f584} 10106354} 101115d9} 95 PAGE 106 10125c5aifmsg&&rc>0{astrcatoptlog,msg;freemsg;} 1013976f} 1014da39 1015d830/*Detecttheendoftheoptions.*/ 101604deifc==-1 10179397break; 101805daswitchc{ 10195956case0: 10207267ifoption_index PAGE 107 11660479* 1167b7df*Enabledynamicmutationincreasemutationratewhenfitnessis 1168c262*stagnant.--mutationratehaslittleeffectinthismode. 11690479* 1170bc63*Howitworks:Basically,itmaintains"leading"and"trailing"values, 11715709*withthetrailingvalueTequaltotheleadingvaluefromw"width" 11721715*generationsago,andleadingvalueLisupdatedeachgenerationby 11739593*L=L*f-1/f+F,whereFistheaveragefitnessofthegenerationandf 11743a04*isthe"factor".FromthisanewmutationrateMiscomputedby 11757335*M=m+exp-|D|*r,whereD=L-T/f,mis"min"andris"range". 1176562c*/ 1177298c{"dynamic-mutation",no_argument,0,10}, 11782a50/**--dynamic-mutation-widthNUMBER 11790479* 1180c99b*Noeffectunlessdynamicmutationisenabled. 11815fc2*Widthforconsiderationofchangeinfitness. 1182562c*/ 11838848{"dynamic-mutation-width",required_argument,0,11}, 1184958d/**--dynamic-mutation-factorNUMBER 11850479* 1186c99b*Noeffectunlessdynamicmutationisenabled. 11874929*Arbitraryfactorusedincomputationofdynamicmutation. 1188562c*/ 1189f8af{"dynamic-mutation-factor",required_argument,0,12}, 11900ba7/**--dynamic-mutation-minNUMBER 11910479* 1192c99b*Noeffectunlessdynamicmutationisenabled. 11937441*Minimummutationrate,in0-1. 1194562c*/ 1195b4ee{"dynamic-mutation-min",required_argument,0,13}, 1196dee3/**--dynamic-mutation-rangeNUMBER 11970479* 1198c99b*Noeffectunlessdynamicmutationisenabled. 1199e1d3*Rangeofmutationrates,in0-1Maximummutationrateis 12007093*minimum+range,whichmustbelessthanorequalto1. 1201562c*/ 120273fe{"dynamic-mutation-range",required_argument,0,14}, 1203154b/* 1204df29{"verbose",no_argument,&verbose_flag,1}, 1205eff2{"brief",no_argument,&verbose_flag,0}, 12065163*/ 1207ab02{0,0,0,0} 12085f2f}; 12097ed9staticconstchar*global_optstring="c:g:p:s:hDT:E:M:W:"; 1210c501intc; 12115a23structoption*long_options; 12123903structoptionconfig_options[2]={global_long_options[0],{0,0,0,0}}; 12130d1achar*optstring=NULL; 1214444achar**temp_argv=mallocsizeofchar**argc;/*NULLcheckisbelow*/ 12151b83char*loadconfig=NULL; 1216da39 1217053a/*Combineglobal_long_optionsandmy_long_options*/ 1218982eintglobal_count=0; 12196ae6intmy_count=0; 12203f24forglobal_count=0;;global_count++/*Donotinclude0incount*/ 12211a41ifglobal_long_options[global_count].name==0break; 122253f6formy_count=1;;my_count++/*Include0incount*/ 12231c1difmy_long_options[my_count-1].name==0break; 12244e57long_options=mallocsizeofstructoption*global_count+my_count; 1225b84eif!long_options{printf"Outofmemorylong_optionsn";return1;} 12264ec3memcpylong_options,global_long_options, 12277f1esizeofstructoption*global_count; 12287e50memcpylong_options+global_count,my_long_options, 12298178sizeofstructoption*my_count; 12302a28/*Combineglobal_optstringandmy_optstring*/ 123160eaoptstring=mallocstrlenglobal_optstring+strlenmy_optstring+1; 12327ee7if!optstring{printf"Outofmemoryoptstringn";return1;} 1233d78dstrcpyoptstring,global_optstring; 12349855strcatoptstring,my_optstring; 1235da39 1236a122/*Searchforconfigfilespecifications*/ 1237c7acoptind=1;opterr=0; 12388991/*Usetemp_argvduetopermutationbygetopt*/ 1239d645if!temp_argv{printf"Outofmemorytemp_argvn";return1;} 1240135bmemcpytemp_argv,argv,sizeofchar**argc; 12414589/*Searchforconfigfileparameter*/ 12423065while{ 1243d79f/*getopt_longstorestheoptionindexhere.*/ 12440377intoption_index=0; 1245da39 12466e99c=getopt_longargc,temp_argv,"c:",config_options, 12470254&option_index; 1248c34cifc=='c'{ 124982c7ifloadconfig{ 125032b3printf"Toomanyconfigurationfilesspecifiedn"; 1251be5dexit; 125215d9} 1253e6f0loadconfig=optarg; 1254976f} 12552ba8elseifc==-1/*Detecttheendoftheoptions.*/ 12569397break; 125748d0} 1258499afreetemp_argv; 1259d938opterr=1; 12602c19/*Nowloadconfigurationfile*/ 1261ff96ifloadconfig{ 1262ab1cFILE*fh; 1263a913iffh=fopenloadconfig,"r"==NULL{ 12642c46char*str; 12654b3aasprintf&str,"%s:%s",argv[0],loadconfig; 1266eb1bperrorstr; 1267e490freestr; 1268b289exit; 1269e266} 12703384/*printf"Loadingconfigfile%sn",loadconfig;*/ 12718316while1{ 12721481char*key=malloc*sizeofchar; 12734ebcchar*value=malloc*sizeofchar; 127403a3typedefchar*foo; 1275407ffoofake_argv[4]; 127648f0fake_argv[2]=fake_argv[3]=0; 1277392fif!key||!value{ 1278bf81printf"%s:%s:Allocationerror",argv[0],loadconfig; 1279738fexit; 12806b4e} 12811127intrc=fscanffh,"%511s%*[t]%4095[^n]",key,value; 128236cd/*printf"%d%p%pn",rc,key,value;*/ 12832d44ifrc==EOF&&ferrorfh{ 1284678achar*str; 12856f8aasprintf&str,"%s:%s",argv[0],loadconfig; 1286a4aeperrorstr; 1287079afreestr; 1288738fexit; 12896b4e} 12902061elseifrc==EOFbreak; 129149b0elseifrc>=1&&key[0]=='#'||key[0]==';'||key[0]=='%'|| 1292e795key[0]=='!'{/*Commoncommentcharacters*/ 12933151freekey; 12941f87freevalue; 129515f0continue; 12966b4e} 1297f1daelseifrc<1||rc>2||!key{ 12989e10printf"%s:%s:Syntaxerrorn",argv[0],loadconfig; 1299738fexit; 13006b4e} 1301e6b5/*printf"'%s''%s'n",key,value?value:"";*/ 13025b52fake_argv[0]=argv[0]; 1303adcbifasprintf&fake_argv[1],"--%s",key==-1{ 13048ab3printf"%s:%s:OutofMemoryn",argv[0],loadconfig; 1305738fexit; 13066b4e} 13077708ifrc>=2fake_argv[2]=value; 13088ef7optind=1; 1309d036GA_run_getoptrc+1,fake_argv,settings,optstring,long_options, 1310781bglobal_count,my_parse_option,my_usage,optlog,'F'; 1311dccc/*freefake_argv[1];*/ 1312b6d3freekey; 1313e25d/*freevalue;*//*Don'tfreevalue,wemightkeepapointertoit*/ 1314e266} 1315c09bfclosefh; 131648d0} 1317da39 1318e740optind=1; 13191618GA_run_getoptargc,argv,settings,optstring,long_options, 97 PAGE 108 13206e84global_count,my_parse_option,my_usage,optlog,'A'; 13218b9efreeoptstring; 13228f2afreelong_options; 132334e8return0; 1324c2b7} 1325da39 1326da60intqprintfconstGA_settings*settings,constchar*format,...{ 132780aava_listap; 1328107cintrc=0; 1329f46f#ifTHREADS 13300905{/*LOCKio*/ 13312a83inttrc=pthread_mutex_lock&GA_iomutex; 13329724iftrc{printf"qprintf:mutex_lockio:%dn",trc;exit;} 133348d0} 1334ec3bflockfilestdout; 13356489/*flockfilesettings->stdoutfh;*/ 1336236b#endif 13378b85va_startap,format; 1338c2ferc=vprintfformat,ap; 133968acva_endap; 1340b10aifsettings->logfh{ 1341ac33va_startap,format; 1342a184rc=vfprintfsettings->logfh,format,ap; 1343f475va_endap; 134448d0} 1345f46f#ifTHREADS 1346b1be/*funlockfilesettings->stdoutfh;*/ 13475be5funlockfilestdout; 1348821e{/*UNLOCKio*/ 1349e98cinttrc=pthread_mutex_unlock&GA_iomutex; 13505425iftrc{printf"qprintf:mutex_unlockio:%dn",trc;exit;} 135148d0} 1352236b#endif 13539cadreturnrc; 1354c2b7} 1355da39 1356e83cinttprintfconstchar*format,...{ 135780aava_listap; 1358107cintrc=0; 1359f46f#ifTHREADS 13600905{/*LOCKio*/ 13612a83inttrc=pthread_mutex_lock&GA_iomutex; 1362cacdiftrc{printf"tprintf:mutex_lockio:%dn",trc;exit;} 136348d0} 1364ec3bflockfilestdout; 1365236b#endif 13668b85va_startap,format; 1367c2ferc=vprintfformat,ap; 136868acva_endap; 1369f46f#ifTHREADS 13705be5funlockfilestdout; 1371821e{/*UNLOCKio*/ 1372e98cinttrc=pthread_mutex_unlock&GA_iomutex; 1373046fiftrc{printf"tprintf:mutex_unlockio:%dn",trc;exit;} 137448d0} 1375236b#endif 13769cadreturnrc; 1377c2b7} 1378da39 13792379intlprintfconstGA_settings*settings,constchar*format,...{ 138080aava_listap; 1381107cintrc=0; 1382f46f#ifTHREADS 13830905{/*LOCKio*/ 13842a83inttrc=pthread_mutex_lock&GA_iomutex; 1385cacdiftrc{printf"tprintf:mutex_lockio:%dn",trc;exit;} 138648d0} 1387ec3bflockfilestdout; 1388236b#endif 13898b85va_startap,format; 13900ef1rc=vfprintfsettings->logfh?settings->logfh:stdout,format,ap; 139168acva_endap; 1392f46f#ifTHREADS 13935be5funlockfilestdout; 1394821e{/*UNLOCKio*/ 1395e98cinttrc=pthread_mutex_unlock&GA_iomutex; 1396046fiftrc{printf"tprintf:mutex_unlockio:%dn",trc;exit;} 139748d0} 1398236b#endif 13999cadreturnrc; 1400c2b7} 1401da39 14022c93/*Fromhttp://www.linuxquestions.org/questions/programming-9/how-to-calculate14032314time-difference-in-milliseconds-in-c-c-711096/http://tinyurl.com/7bjz2u3 1404c4d2*/ 14059642longlong 1406d86etimeval_diffstructtimeval*difference, 14076f5estructtimeval*end_time, 14080b68structtimeval*start_time 14099ad7 141060ba{ 14115576structtimevaltemp_diff; 1412da39 1413b237ifdifference==NULL 14141171{ 14150a26difference=&temp_diff; 141648d0} 1417da39 14187fc5difference->tv_sec=end_time->tv_sec-start_time->tv_sec; 141934d6difference->tv_usec=end_time->tv_usec-start_time->tv_usec; 1420da39 142138f5/*Usingwhileinsteadofifbelowmakesthecodeslightlymorerobust.*/ 1422da39 1423a7a4whiledifference->tv_usec<0 14241171{ 14253330difference->tv_usec+=1000000; 1426ef8adifference->tv_sec-=1; 142748d0} 1428da39 1429a447return1000000LL*difference->tv_sec+ 1430ae71difference->tv_usec; 1431da39 14325575}/*timeval_diff*/ 1433da39 1434adf0/*PRNGinterface*/ 14357547#ifHAVE_GSL 14369959/*UseGSLtoreproducetheglibc2random_rcall.Wecaneasilyreplacethis 14373309*withadifferentPRNG.*/ 14382aa8staticintGA_rand_initGA_session*session,unsignedlongintseed{ 1439a908session->r=gsl_rng_allocgsl_rng_mt19937;//gsl_rng_random_glibc2; 144033f5if!session->rreturn1; 14413ad7gsl_rng_setsession->r,seed; 144234e8return0; 1443c2b7} 1444da39 1445dffdunsignedintGA_randGA_session*session{ 1446e30areturnunsignedintgsl_rng_getsession->r;/*unsignedlongint*/ 1447c2b7} 1448da39 14496949doubleGA_rand_doubleGA_session*session{ 14504ce9returngsl_rng_uniformsession->r; 1451c2b7} 1452da39 145322c7#else 14547e8a/*Initializerandom_r*/ 1455c4f6staticint32_trandtbl[32]={ 145624373, 1457a08c-1726662223,379960547,1735697613,1040273694,1313901226, 145870681627687941,-179304937,-2073333483,1780058412,-1989503057, 1459792a-615974602,344556628,939512070,-1249116260,1507946756, 1460989d-812545463,154635395,1388815473,-1926676823,525320961, 14619b44-1009028674,968117788,-123449607,1284210865,435012392, 1462e76e-2017506339,-911064859,-370259173,1132637927,1398500161, 146398c1-205601318, 1464479f}; 1465da39 14662aa8staticintGA_rand_initGA_session*session,unsignedlongintseed{ 146791f9memcpy&session->randtbl,randtbl,sizeofsession->randtbl; 1468fc5b/*Initializerandom_rtobethesameasrandom.Thisishow 14693468*it'sdoneinmyversionoflibceglibc-2.11.1,Ubunu10.04*/ 14708712ifinitstate_rseed,char*session->randtbl,128,&session->rs{ 14713802perror"initstate_r"; 147279b3return1; 147348d0} 98 PAGE 109 147434e8return0; 1475c2b7} 1476da39 1477dffdunsignedintGA_randGA_session*session{ 147846d4intr; 1479fce7ifrandom_r&session->rs,&r!=0{ 14806badperror"random_r"; 1481ac84exit; 148248d0} 14830f1freturnunsignedintr; 1484c2b7} 1485da39 14866949doubleGA_rand_doubleGA_session*session{ 1487a299returndoubleGA_randsession/RAND_MAX; 1488c2b7} 1489da39 1490236b#endif File3: ga.h 15adc/**filega.h 25a96* 38475*Geneticalgorithmheaderfile. 4c4d2*/ 5c61d#ifndef_HAVE_GA_H 60b19#include PAGE 110 13010c5/**Alistofindexesintothepopulation,sortedbyfitness.The 1310eba*mostfitindividualisfirst.*/ 132728eunsignedint*sorted; 133a8c7/**Thegenerationofthepopulation.Theinitialrandompopulation 134122e*isgeneration0. 1358f07*/ 1367956unsignedintgeneration; 1376fed/**Theindexofthefittestindividualofthepopulation.*/ 138ac05unsignedintfittest; 139aee1/**Thesumofthefitnessoverallindividuals.Usedbythe 140a826*roulettealgorithm.*/ 141ddcfdoublefitnesssum; 1422f8e/**Thesizeofthefitnesscache.*/ 143837funsignedintcachesize; 144d930/**Dynamicmutationleadingfitness.*/ 14500e1doubledynmut_leading; 146ebca/**Dynamicmutationtrailingfitnessbuffer.*/ 14726c4double*dynmut_trailing; 148ba8f/**Dynamicmutationtrailingfitnessbufferindex.*/ 1497a7aunsignedintdynmut_trailing_pos; 150da39 151b3b8/**PointertotheGA_settingsstructureforthissession.*/ 1521aafGA_settings*settings; 1532111/**ArrayofGA_threadstructures,representingeachthread.*/ 1546ca2GA_thread*threads; 155f46f#ifTHREADS 156f4cc/**Mutextocontrolaccesstoworkerthreadfrommainthread.*/ 157cdccpthread_mutex_tinmutex; 1580cf2/**Conditionalvariabletosignalworkerthreadofnewinputdata.*/ 159d62dpthread_cond_tincond; 160e178/**Indexofpopulationelementtoevaluate.*/ 1614027intinindex; 162ba3d/**Flagtoindicateavailabilityofnewinputdata.*/ 1630c6eintinflag; 164da39 1657b66/**Mutextocontrolaccesstomainthreadfromworkerthread.*/ 166e0a6pthread_mutex_toutmutex; 1670030/**Conditionalvariabletosignalmainthreadofnewoutputdata.*/ 168f5d2pthread_cond_toutcond; 169d276/**Indexofpopulationelementthatwasevaluated.*/ 17024e9intoutindex; 171f32f/**ReturnvalueofGA_do_checkfitness.*/ 172821dintoutretval; 173d316/**Flagtoindicateavailabilityofnewoutputdata.*/ 174eb8cintoutflag; 175da39 176cf4e/**Mutextocontrolaccesstofitnesscache.*/ 17706bfpthread_mutex_tcachemutex; 178236b#endif 1797547#ifHAVE_GSL 18043a0/*Randomnumbergenerator.*/ 181229egsl_rng*r; 18222c7#else 183a3e5/**Randomnumbergeneratorstatevariable.*/ 184da01structrandom_datars; 185a3e5/**Randomnumbergeneratorstatevariable.*/ 18638deint32_trandtbl[32]; 187236b#endif 1885c75}GA_session; 189da39 19033e5/*Libraryfunctions*/ 191da39 1922563/**InitializeaGA_settingsobjectwiththedefaultsettings. 1935a96* 1949d71*paramsettingsTheGA_settingsobjecttoinitialize.Mustalready 195efcc*beallocated. 1965a96* 1971d02*returns0toindicatesuccess. 198c4d2*/ 19936a3intGA_defaultsettingsGA_settings*settings; 200da39 201e28f/**InitializetheGA_sessionobject.Thisfunctionallocatesthe 202571c*dynamicfieldsintheobjectandgeneratesarandompopulation. 2035a96* 2041713*paramsessionTheGA_sessionobjecttointialize. 2051b9c*paramsettingsTheGA_settingsobjectcontainingthesettings 206923b*forthenewsession. 2077763*paramsegmentcountThenumberofsegmentsineachindividual. 208c3eb*Notethatthebit-widthofeachsegmentissetbythe 209e6cb*compile-timedefinitionofGA_segment. 2105a96* 21124d7*returns0toindicatesuccess,1through7ifamemoryallocation 21208ba*failed,50ifaninvalidthreadcountisspecified,51ifanerror 2133a8c*occursstartingathread,55ifthethread_initfunctionfails,90 2148621*ifanyfitnessfunctionfailed. 215c4d2*/ 216e281intGA_initGA_session*session,GA_settings*settings, 2176b78unsignedintsegmentcount; 218da39 219fb61/**Freeallallocatedstructures. 2205a96* 2215046*paramsessionApreviouslyintializedGA_sessionobject. 2225a96* 2231d02*returns0toindicatesuccess. 224c4d2*/ 225c7d0intGA_cleanupGA_session*session; 226da39 2279f72/**Evolvethepopulationbyanumberofgenerations. 2285a96* 229637f*paramsession 230cde2*ApreviouslyintializedGA_sessionobject. 2315ea7*paramgenerations 2321a43*Thenumberofgenerationstorun,0touseGA_settings.generations. 2335a96* 2346d98*returns0toindicatesuccess,1ifanyfitnessfunctionfailed. 235c4d2*/ 236267eintGA_evolveGA_session*session,unsignedintgenerations; 237da39 238cb23/**Generateindividualsintothepopulationstartingfromindexi. 2395a96* 2404e9c*Usedtogenerateinitialpopulationofeachgenerationskippingelitism 24162d5*andforreplacingrejectedmembersofthepopulationskippingvalidones. 2425a96* 2435046*paramsessionApreviouslyintializedGA_sessionobject. 24422c3*paramiTheindextostartfrom. 245c4d2*/ 24638afvoidGA_generateGA_session*session,unsignedinti; 247da39 248e072/**Chooseanelementoftheoldpopusingtheroulettealgorithm. 2495a96* 2505046*paramsessionApreviouslyintializedGA_sessionobject. 2515a96* 2523203*returnsAnindexintothepopulation. 253c4d2*/ 2548d1eunsignedintGA_rouletteGA_session*session; 255da39 256c7df/**Comparisonfunctionforsortingindividualsbyfitness. 2575a96* 2589388*seeqsort 259c4d2*/ 26037f7intGA_comparatorconstvoid*a,constvoid*b; 261da39 262284d/**Checkthefitnessofallindividuals.Updates 263102a*GA_individual.fitness,GA_session.fittestGA_session.fitnesssum. 2645a96* 2655046*paramsessionApreviouslyintializedGA_sessionobject. 2665a96* 267d559*returns0toindicatesuccess,nonzeroifanyfitnessevaluation 2684998*returnedanerror. 269c4d2*/ 270758fintGA_checkfitnessGA_session*session; 271da39 2729e2b/**nameProblem-specificFunctions 2735a96* 27419e0*Thesefunctionsmustbeimplementedspecificallyforeachproblem. 2755a96* 2762bcf*todoConsiderreplacingwithfunctionpointersinGA_settings. 2775a96* 278698f*{*/ 279da39 280c47c/**Determinethefitnessofthegivenindividual.Theimplementation 281b5c2*ofthisfunctionshouldsetGA_individual.fitnesstoa 282093d*double-precisionfloating-pointfitnessvalue,whichwillbe 2835e7a*maximizedbythegeneticalgorithm. 100 PAGE 111 2845a96* 28567da*returns0forsuccess,nonzeroforfailure. 286c4d2*/ 287603dexternintGA_fitnessconstGA_session*ga,void*thbuf,GA_individual*elem; 288da39 28983c3/**Checkifaterminationconditionhasbeenreached. 2905a96* 29140ef*returns0iftheevolutionshouldcontinue,nonzeroifthe 2927a4a*evolutionshouldterminate. 293c4d2*/ 29408ceexternintGA_terminationconstGA_session*ga; 295da39 296e140/**Initializeproblem-specificthreadstate. 2975a96* 298925d*paramthreadTheGA_threadobjectforthethread. 2995a96* 30067da*returns0forsuccess,nonzeroforfailure. 301c4d2*/ 302854dexternintGA_thread_initGA_thread*thread; 303da39 3048758/**Freeproblem-specificthreadstate. 3055a96* 306925d*paramthreadTheGA_threadobjectforthethread. 3075a96* 3080948*returns0forsuccess. 309c4d2*/ 310368cexternintGA_thread_freeGA_thread*thread; 311da39 312c7a3/**Quicklycheckthefitnessofthegivenindividualagainstbasic 31353dd*constraints.Ifthisfunctionreturnsazerovalueunfit,the 3140ff5*individualwillberejectedandanewonegeneratedinitsplace. 3155a96* 3164304*returns0forunfit,nonzeroforfit. 317c4d2*/ 3187f2eexternintGA_fitness_quickconstGA_session*ga,GA_individual*elem; 319da39 3208a77/**Generatearandomnumber. 3215a96* 322db5d*returns0forsuccess,nonzeroforerror. 323c4d2*/ 32494baexternGA_segmentGA_random_segmentGA_session*ga,constunsignedinti, 32589d3constunsignedintj; 326da39 32769df/**Tasktocompleteaftereachgeneration. 3284bca*savestate,outputprogress,etc. 3295a96* 330db5d*returns0forsuccess,nonzeroforerror. 331c4d2*/ 332d3cdexternintGA_finished_generationconstGA_session*ga,intterminating; 333da39 334cf15/**Tasktocompletebeforeeachgeneration. 335872c*updateoptionsthatchangeper-generation 33643e5*Indistributedmode,executeonbothserverandclients. 3375a96* 338db5d*returns0forsuccess,nonzeroforerror. 339c4d2*/ 34048f8externintGA_starting_generationGA_session*ga; 341da39 342930e/*}*/ 343da39 344d6a0/*Graycodehelperfunctions*/ 345da39 346596f/**HelperfunctiontoconvertfromGraycodetobinary. 3475a96* 3481ce9*seehttp://en.wikipedia.org/wiki/Gray_code 349c4d2*/ 3504c20GA_segmentgraydecodeGA_segmentgray; 351da39 35232d6/**HelperfunctiontoconvertfrombinarytoGraycode. 3535a96* 3541ce9*seehttp://en.wikipedia.org/wiki/Gray_code 355c4d2*/ 356bb81GA_segmentgrayencodeGA_segmentbin; 357da39 358b115/**Generatearandomintegerusingthe/dev/urandomspecialdevice. 3595a96* 360afd0*noteThe/dev/urandomspecialdeviceisdesignedforsecurity,not 3618028*speed,sothisfunctionshouldnotbeusedforlargeamountsof 3624d04*randomdata. 3635a96* 364e178*see PAGE 112 438da39 439e480/**Generatearandomnumberusingthesession'srandomnumbergenerator. 4403c17*Double-precisionfloating-pointin[0,1.Notthread-safe. 441c4d2*/ 4426ed2doubleGA_rand_doubleGA_session*session; 443da39 444f46f#ifTHREADS 4454892/*Considerabandoningthismutexinfavorofflockfileon*/ 4469467externpthread_mutex_tGA_iomutex;/*=PTHREAD_MUTEX_INITIALIZER;*/ 447236b#endif 448da39 4499f42#define_HAVE_GA_H 450236b#endif File4: ga-clientonly.h 148cc#ifndef_HAVE_GA_CLIENTONLY_H 20b19#include PAGE 113 663df9PATH=.ga-spectroscopy--help>doc/ga-spectroscopy.txt 67d408doxygenDoxyfile 68bcbacp-pdoc/mydoxygen.stydoc/latex 69b919$MAKE-Cdoc/latexpdf 70da39 71e6aa.PHONY:allcleandoc File6: gaspecdist.pl 104c4#!/usr/bin/perl 2d08f# 320bf#gaspecdist.pl-DistributedComputingDispatcherforga-spectroscopy 4d08f# 581d8#Receivesgenerationsfromga-spectroscopyprocesses,packagestheminto 629bc#workunits,andsendsworkunitstotheServer. 7d08f# 8851euseIO::Socket::INET; 97e7duseIO::Select; 107c8euseDigest::MD5; 11f00buseFile::Spec; 1287acuseFile::Temp; 13e339useJSON; 14dfe6useTime::HiRes; 15ce53useFile::Copy; 16fa32useURI; 179d9busewarnings; 188428usestrict; 19da39 20545d#Distributedcomputingserver.Overriddenbyserver.conf,ifpresent. 2116b6my$SHOST,$SPORT='localhost',9933; 222ec7#Hostnameandportnumbertolistenonforconnectionsfromga-spectroscopy. 23021cmy$LHOST,$LPORT='localhost',2222; 24ff68#HostnameandportnumberforHTTPserverforclientstoloaddatafiles. 25eac0#IfURLsdonotincludeahostname,thevalueof$WHOSTmayhavenoeffect. 269552my$WHOST,$WPORT='localhost',9990; 27da39 28d33a#Loadsomeserverinformationfromserver.conf 29740fopenF,'<','server.conf'ordie"Cannotloadserver.conf:$!"; 306e68my$serverconf=from_jsonjoin'', PAGE 114 141294bremaining=>0,gentime=>0, 142276dlastgentime=>0,genstart=>Time::HiRes::time, 1435d0cfiles=>[[get_app]],workerspeed=>{}}; 1443bdb$select->add$client; 145d384} 14653b3else{ 147d132my$id=fileno$sock; 148d1d5unlessexists$socks{$id}{ 1498a8b#Trynottocrash 15035b4warn"Readfromphantomsocket$sock"; 1514a6a$select->remove$sock; 1524d5cnext; 153c050} 154d7f5my$rc=sysread$sock,$socks{$id}{buf},512, 1556c2alength$socks{$id}{buf}; 156e82cif!$rc{ 1574a37warn"sysreadfromsockID=$idfailed:$!"if!defined$rc; 15831b2#If$rcisdefined,thenwehaveEOF 1594a6a$select->remove$sock; 1608a36#TODO:Canceljobs,removefrom@items,etc. 1614d5cnext; 162c050} 163c06b#Handledatafromsocket 164b585HandleSocket$id; 165d384} 1662a9a} 167e266} 168c2b7} 169da39 1706148#Handleareturnedworkunit 1711fc8subWorkReturned{ 172239dmy$sock,$command,$reply=@_; 173282c#FIXME:Dosomesanitychecks 1741984my$source=-1; 175035emy$gasock=undef; 1763ce2my$receivedtime=Time::HiRes::time; 177fe98my$fatal=0; 178576amy$nreturned=0; 179de51my$failed=$commandeq'FAILED'; 180a665if$reply->{files}&&@{$reply->{files}}{ 181ea1fmy$valid,undef,$fn=@{$reply->{files}[0]}; 1821fa5my$checksum,$buf; 1836fabif$fn=~m/^data:/{ 1843b49my$u=URI->new$fn; 1853bd2$buf=$u->data; 1862a9a} 1875834else{ 18821b1$fn="$INBOXDIR/$fn"; 1890a71$buf=''; 190b6bcifopenF,'<',$fn{ 191be8abinmodeF; 1921ffe$buf.=$_while PAGE 115 2953584subget_app{ 2967fd6my$appfile="$APPDIR/$APPNAME"; 297b66acopy$appfile,"$TEMPDIR/$APPNAME" 298de2borwarn"Can'tcopy$appfiletotemp:$!"; 299f8b9returnmd5$appfile||'',$APPURL,$APPNAME; 300c2b7} 301da39 30234e5subRemoveFileDependency{ 303bfc3#Removeafilerequirementforagivenjob 3041f61my$id,$fn=@_; 305526emy$n=@{$socks{$id}{files}}; 3065299formy$i=0;$i<$n;$i++{ 3077d0dif$socks{$id}{files}[$i][2]eq"$REMOTETEMPDIR/$fn"{ 308b10e#Splicethiselementfromthearrayandreturn 3091e3dsplice@{$socks{$id}{files}},$i,1; 3109a5breturn; 3112a9a} 312e266} 313ad24warn"Couldnotremove$fnfromdependenciesofsock$id"; 314c2b7} 315da39 3169664subHandleSocket{ 317dc50my$id=@_; 3188c07while$socks{$id}{buf}=~s/^.*?r*n//{ 31951eemy$l=$1; 32033a4print"G$id:$ln"unless$l=~m/^I/; 321f26cif$l=~m/^V.+$/{ 3228b0bmy$configdata=$socks{$id}{config}."$ln"; 323bff8#Saveconfigtofile 324b8c6my$fh; 325c704if$socks{$id}{configfile}{ 3263198#Removetheexistingconfigfilefromthelistofdependencies 32792a1my$fn=MakeRelPath$socks{$id}{configfile},$TEMPDIR; 328a559RemoveFileDependency$id,$fn; 329be87#Opentheexistingconfigfile 3307568open$fh,'>',$socks{$id}{configfile} 3319a6borwarn"Cannotwriteto$socks{$id}{configfile}"; 332d384} 33353b3else{ 3349883#Createandopenanewconfigfile 3353c48$fh,$socks{$id}{configfile}= 33656d9File::Temp::tempfile"$DISPATCHID-config-XXXXX", 3378b95DIR=>$TEMPDIR,OPEN=>1, 338e98fEXLOCK=>0,UNLINK=>0; 339d384} 3405f13print$fh$configdata; 341b6a6close$fh; 34208a7my$checksum=Digest::MD5::md5_hex$configdata; 3439955my$fn=MakeRelPath$socks{$id}{configfile},$TEMPDIR; 344f3b2push@{$socks{$id}{files}}, 3455836[$checksum,"$TEMPURL/$fn","$REMOTETEMPDIR/$fn"]; 3462a9a} 347c585elsif$l=~m/^Gd+/{ 348e9a7$socks{$id}{gen}=$1; 3492874$socks{$id}{wucount}='aaa'; 3500f06#ResetthegenerationtimershereinordertosupportDR 351eb71#whichwouldhaveproblemsiftheseweretoresetonDISPATCH. 352b696$socks{$id}{lastgentime}=$socks{$id}{gentime}; 3531b45$socks{$id}{genstart}=Time::HiRes::time; 354b986openF,'>>','/tmp/workerspeed.log'; 3551855printF"GENTIME",$socks{$id}{gen}-1,"$socks{$id}{lastgentime}n"; 356dedfcloseF; 3572a9a} 3586ff2elsif$l=~m/^Id+.+$/{ 3598ee7my$origindex,$values=$1,$2; 360591f$socks{$id}{remaining}++; 361a786AppendItem{source=>$id,origindex=>$origindex,workunit=>'', 3629062values=>$values,fitness=>undef}; 3632a9a} 364ab47elsif$l=~m/^DISPATCH/{ 3652734#Preparethisgenerationtobesent 3661c97my$founditems=0; 367158bformy$i=0;$i<@items;$i++{ 3681defmy$item=$items[$i]; 3691d04ifdefined$itemand$item->{source}==$id{ 370b1d8$item->{sent}=0; 3719804$founditems=1; 3724d32$haveworkcache=$iif$i<$haveworkcache; 373c050} 374d384} 375f766#ReceivedemptypopulationentirepopulationwasinGAcache 376393cif!$founditems{ 3770234my$gasock=$socks{$id}{sock}; 3783baeprint$gasock"DONEn"; 379d384} 38053b3else{ 3811e4eTrySending; 382d384} 3832a9a} 3847979elsif$l=~m/^CFG[A-Z0-9]S+?:.+?/{ 38599acmy$type,$opt,$val=$1,$2,$3; 38655e1my@suffixes=; 387873b#Someconfigurationlineswillspecifydependencyfiles. 3886895if$opt=~m/^match|template|drfile$/&&defined$val{ 3899781#.INT+.VARfiles 3908a56@suffixes=$opteq'template'?'.int','.var':''; 391d384} 392c867#Removeexistingconfigurationitemsanddependencyfiles. 3933173#Note:Bydefault,perlre's$willignorenewlinesattheendof 394abd4#thestring. 395b9ee#DoNOTdothisforamin,amax,...,delkmin,delkmax,asthose 3966da0#optionscanbespecifiedmultipletimes. 3975f5cif$opt!~m/min|max$/&&$socks{$id}{config}=~ 3987d43s/^|nCFG[A-Z0-9]$opt[^n]*?n|$/$1/&& 3990a86@suffixes&&$3{ 4002f03#Removeobsoletedependencyfiles 40154a8myundef,undef,$oldfile=File::Spec->splitpath$3; 4027f3dmy$l=$socks{$id}{files}; 4034000foreachmy$suffix@suffixes{ 404f9admy$thisfile="$REMOTEDATADIR/$oldfile$suffix"; 40550bdmy$removed=0; 40613a2formy$i=0;$i<@$l;$i++{ 4071b5bif$l->[$i][2]eq$thisfile 4089d95{splice@$l,$i,1;$removed++} 4095717} 410a2ccwarn"Remove$removeddependenciesfor$thisfile." 411ee4dif$removed!=1; 412c050} 413d384} 414d9b7#Ifthisconfigurationlinespecifiesafile,locatethefile 415b04f#andaddittoourlistofdependencies. 416b9e4if@suffixes{ 4178683#Determineourdestinationfilename 4187d74myundef,undef,$outfile=File::Spec->splitpath$val; 4194c87$outfile="$REMOTEDATADIR/$outfile"; 420af60#Whereisthisdatafilereally? 4214000foreachmy$suffix@suffixes{ 422defcmy$realfn=File::Spec->rel2abs$val.$suffix,$SPECDIR; 4233b1dmy$wwwfn=MakeRelPath$realfn,$DATADIR; 42445bfmy$checksum=md5$realfn; 4258c11if!$checksumor!-e$realfn{ 42600edwarn"Errorchecksumming$val=>$realfnor". 4278e1c"filedoesnotexistID=$idconfig:$!"; 428e818next; 4295717} 4302cc6if$wwwfn=~m/^../{ 43158cdwarn"Datafile$valisnotin$DATADIR"; 432e818next; 4335717} 4347bb8push@{$socks{$id}{files}}, 435e6bf[$checksum,"$DATAURL/$wwwfn","$outfile$suffix"]; 436c050} 437ffe2$val=$outfile; 438d384} 4391cc6$socks{$id}{config}.="$type$opt". 440d17ddefined$val?"$valn":"n"; 4412a9a} 442e266} 443c2b7} 444da39 445fdf7subMakeRelPath{ 446e14cmy$fn,$dir=@_; 447d493$fn=File::Spec->rel2abs$fn 448c74aunlessFile::Spec->file_name_is_absolute$fn; 105 PAGE 116 449a74ereturnFile::Spec->abs2rel$fn,$dir; 450c2b7} 451da39 452ec70#Addanitemtotheglobalitemlist 453ee02subAppendItem{ 454a80cmy$item=@_; 455d47b$item->{sent}=-1;$item->{received}=0;#$item->{batch}=-1; 456ddbawhile@freeitems{ 457486cmy$i=pop@freeitems; 458a805die"Freeitemslistisbroken"if$items[$i]; 4597af8$items[$i]=$item; 46040a3return; 461e266} 4621deepush@items,$item; 463c2b7} 464da39 465c5c3#Getthenextindividualwecansend 466ad37subGetNextIndividual{ 4673ee7my$client,$from=@_; 468d987#Mostcommoncase:client=-1findanyworkfromanyGA.Ifthisfailed 469d610#lasttime,returnacachedresponseuntilwegetmorework. 470e8damy$i=$from+1; 4712249#Theworkcachekeepsthisfunctionfastbytrackingthestartingpoint 472f0eb#andtrackingtheexistanceofavailablework. 473a632$i=$haveworkcacheif$client<0; 4748afefor;$i<@items;$i++{ 4755f88if$items[$i]&&$items[$i]{sent}==$items[$i]{received}&& 476d30f$items[$i]{sent}==0{ 477f6a6#Thisindividualisreadytosend. 47813e2if$client>=0&&$items[$i]{source}!=$client{ 479fc76#Butit'snotfromtherightclient.Keepgoing. 480640f$haveworkcache=$iif$i<$haveworkcache; 481a286next; 482d384} 483896a#Updatethecache,andreturnthisindividual. 48400f2#Leavetheworkcachepointerhereincaseweignorethereturn 48562c1#valueonlycheckingifworkexistsandnotcollectingityet. 486b495$haveworkcache=$iif$i<$haveworkcache; 48767c3return$i; 488f584} 4897a04#Ifwe'vepassedthecachepointandtheindividualwasunsendable, 49027c2#bringthecachepointalonguntilwefindasendableindividual. 491f97c$haveworkcache++if$haveworkcache==$i; 492e266} 493b4bfreturn-1; 494c2b7} 495da39 496fd97#Sendworktoaworker 497bb4dsubSendWork{ 4985a4amy$worker=@_; 499da39 5003d3f#Findanindividualwecandispatch 501e2f0my$i=GetNextIndividual-1,-1; 5025c97returnif$i<0;#Noworkavailable 503c095my$client=$items[$i]->{source}; 504da39 505463a#WorkerspeedvariesbasedonGArunsettings 50606a1my$ws=$socks{$client}{workerspeed}{$worker->{id}}||[]; 5077e93#my$duration=$WUDURATION+int*rand; 508fe07my$remaining=$socks{$client}{lastgentime}? 509f339$socks{$client}{lastgentime}510b412Time::HiRes::time-$socks{$client}{genstart}:15; 511aab3my$duration=.5*$remaining; 5122fb9$duration=2if$duration<2; 513b0f9$duration=$WUDURATIONif$duration>$WUDURATION; 514d90bmy$n=0; 515ebc6if@$ws>=4{ 5161b7f#Determinetheaveragespeedoftheworker. 5176647my$mean=0; 518a24e$mean+=$_foreach@$ws; 5190ad1$mean/=@$ws; 520749cmy$denom=0; 52126a6foreach@$ws{ 522ae06#Weightexcessivelyslowworkunitsmore,excessivelyfast 523cdbc#workunitsless. 524cc26my$w=$_>1.1*$mean?.707***$_/$mean: 525c464$_<.9*$mean?2***$_/$mean:1; 5262c71$n+=$w*$_; 527e0ae$denom+=$w; 5282a9a} 5293b3f$n=$n/$denom*$duration; 530f4d5$n=$n<1?1:floor$n; 531e266} 532fb9belsif@$ws==1{$n=$ws->[0]*$duration} 5335e39else{$n=10}#Justsend10workunitsuntilwefigureoutthespeed. 534da39 53543b0#GenerateworkunitID 5363fc8my$wuid=sprintf"%s-%04d-%04d%s",$DISPATCHID,$socks{$client}{uniqid}, 53721a4$socks{$client}{gen},$socks{$client}{wucount}; 538a723$socks{$client}{wucount}++; 539da39 5409962#Generateourworkunitpopulation 5410a5cmy$popfile="G$socks{$client}{gen}n"; 542390fmy$c=0;#my@wuitems=; 5438237do{ 5441131lastif$i<0; 54516aa$items[$i]{sent}++; 5460a55$items[$i]{workunit}=$wuid;#FIXME:Iteminmultipleworkunits? 547e46f$c++;#push@wuitems,$i; 5486730$popfile.="I$i$items[$i]{values}n"; 5493ac2}while$c<$nand$i=GetNextIndividual$client,$i>-1; 550da39 5514af2my$checksum=Digest::MD5::md5_hex$popfile; 55263ecmy$popurl,$popfn; 553d3feif0{ 55480e9#Savepopulationtofile 5557dfemy$fh,$popfn= 55635edFile::Temp::tempfile"$DISPATCHID-pop-XXXXX",DIR=>$TEMPDIR, 5578470OPEN=>1,EXLOCK=>0,UNLINK=>0; 558b7baprint$fh$popfile; 559941aclose$fh; 560864b$popfn=MakeRelPath$popfn,$TEMPDIR; 561bd38$popurl="$REMOTETEMPDIR/$popfn"; 562e266} 563097celse{ 5641009#Encodepopulationtoadata:URI 5650e51my$u=URI->new'data:'; 566185a$u->data$popfile; 5677a5f$popurl=$u.''; 568510a$popfn="$DISPATCHID-pop-$checksum"; 569e266} 570da39 571ab66#Preparetodispatch 5727036$workunits{$wuid}={ 5730284id=>$wuid,worker=>$worker->{id},nitems=>$c, 574f5faupload=>$UPLOADURL, 5753544files=>[@{$socks{$client}{files}}, 576b792[$checksum,$popurl,"$REMOTETEMPDIR/$popfn"]], 577a273sent=>Time::HiRes::time,duration=>$duration, 578fdcb}; 5793e60print$distsock'DISPATCH',to_json$workunits{$wuid},"n"; 5801400#Telltheserverthatwewillissuenomoreworkintheimmediatefuture. 581568fprint$distsock"NOMOREWORKn"ifGetNextIndividual-1,-1<0; 582c2b7} 583da39 5840a16#Letustrytosendsomeitemsout 5858f3asubTrySending{ 5860a7ereturnifGetNextIndividual-1,-1<0; 587cb86print$distsock"HAVEWORKn"unless$askingforworkers; 588a8cd$askingforworkers=1; 589c2b7} 590da39 591e927submd5{ 592310fmy$fn=@_; 593493aopenF,'<',$fnorreturnundef; 594b9cbbinmodeF; 595b778my$md5=Digest::MD5->new->addfile*F->hexdigest; 596fd05closeF; 5978bccreturn$md5; 598c2b7} 599da39 60010a8subcat{ 601310fmy$fn=@_; 602fa3amy$buf=''; 106 PAGE 117 603493aopenF,'<',$fnorreturnundef; 604b9cbbinmodeF; 605506e1whilesysreadF,$buf,512,length$buf; 606fd05closeF; 607c01creturn$buf; 608c2b7} 609da39 6101dcd1; File7: distserver.pl 104c4#!/usr/bin/perl 2d08f# 3b736#distserver.pl-DistributedComputingServer 4d08f# 5e9bfusethreads; 6851euseIO::Socket::INET; 77e7duseIO::Select; 8dfe6useTime::HiRes; 97c8euseDigest::MD5; 10e339useJSON; 119d9busewarnings; 128428usestrict; 13da39 14b19e#Configuration 158efemy$VERSION=20120516; 16349dmy$PORT=9933; 17da39 186e1d#Minimumclientversion;olderclientswillbejailed.Theywillnotbesent 19fe82#anywork,andclientsversion>=20120210willdisplayanerrormessage. 20518emy$MINCLIENT=20120516; 21da39 225b4e#Deadlinefactors 231ed1my$LAZYDEADLINE,$AGRESSIVEDEADLINE=,1.5; 244a3f#Howlongdowewaitforadistributortoprovidenewworkafterbeingtold 2574e6#ofanavailableworker? 269ef3my$DISTTIMEOUT=3; 276113#Howfrequentlytooutputthelistofworkers 286226my$DUMPINTERVAL=60; 295eaa#Howfrequentlytocheckforcompletedworkunits 30c592my$PROBEINTERVAL=1; 310b79#Filecontainingworksubmissionpassword 329e8amy$KEYFILE='distserver.key'; 33da39 343b37#Createlistenersocket 35e9d0my$listener=IO::Socket::INET->new 36c291LocalAddr=>'0.0.0.0',LocalPort=>$PORT,Listen=>5,Reuse=>1 3781a9ordie"Can'tlistenon$PORT:$!"; 389d69my$select=IO::Select->new$listener; 39d9cblocal$SIG{'PIPE'}='IGNORE'; 40da39 41fbb9#Loadpasswordforjobsubmission 422325my$JOBSENDERTOKEN=Digest::MD5::md5_hex'DISTJOBS',$VERSION,$$,rand; 43fa06ifopenF,'<',$KEYFILE{$JOBSENDERTOKEN= PAGE 118 139160c} 1405717} 141bf63if$wc{ 1429b17my$str=to_json{id=>$wc->{id},name=>$wc->{name}}; 143694eprint$sock"WORKER$strn"; 1445717} 1450b3eelse{print$sock"NOWORKERSn"}#Shouldneverhappen 14652a2#Somebodyisusingworkers,soresetthetimeout. 1471634$freeworkersince=timeif$busythreads<$totalthreads; 148c050} 149c6e9elsif$l=~m/^NOMOREWORK/{ 150bc4a$idledistributors++unless$clients{$id}{idle}; 151be02$clients{$id}{idle}=1; 152c050} 153d5e4elsif$l=~m/^DISPATCH.+$/{ 1543522my$json=$1; 155ada5my$obj=eval{from_json$json}||undef; 1569dfc$obj={}ifref$objne'HASH'; 1572149my$workerid=$obj->{worker}; 1589829my$workersock=$workers{$workerid}{sock}? 1594e02$clients{$workers{$workerid}{sock}}{sock}:0; 16069bamy$err=''; 16129c8#Workunitsanitychecks 1621575if!$obj->{id}{$err.='NoworkunitID.'} 163feb6elsif!$obj->{duration}||$obj->{duration}<=0 164b603{$err.='Noduration.'} 16599fb#FIXME:CheckvalidityofIDgaspec-something 1661e0belsif!$workerid{$err.='Noworker.'} 1673955elsif!exists$workers{$workerid} 168eceb{$err.='Invalidworker.'} 16986f1elsif$workers{$workerid}{assigned}||0>= 170f29f$workers{$workerid}{threads}||0 1717501{$err.='Workerisbusy.'} 1721147elsif!$workersock{ 1735c60DeleteWorker$workerid; 174da52$err.='Workerisoffline.' 1755717} 176e7c0if$err{ 177f132$json=to_json{id=>$obj->{id}||undef, 1789ac8error=>$err}; 1793f5aprint$sock"WORKREJECTED$jsonn"; 180e818next; 1815717} 18282d4#Sendtheworktotheclient. 1830eb4print$sock"WORKACCEPTED$obj->{id}n"; 184dacbprint$workersock"WORK$jsonn"; 1857a0e$workers{$workerid}{assigned}++; 1866702$busythreads++; 1878ca2$freeworkersince=0if$busythreads>=$totalthreads; 188dc92#Savetheworkunitinthehash 189331emy$now=Time::HiRes::time; 1909fa2$workunits{$obj->{id}}= 1911b95{obj=>$obj,source=>$id,worker=>$workerid, 1922f2astarttime=>$now}; 1931136#Insertworkunitinto@workunits 194fb14my$deadline=$now+$LAZYDEADLINE*$obj->{duration}; 195b34dformy$i=0;$i<=@workunits;$i++{ 196249eif$i==@workunits{ 1979cbapush@workunits,$obj->{id}; 198f080last; 199160c} 200fceamy$tw=$workunits{$workunits[$i]}; 20194b3my$twd=$tw->{starttime}+ 202da2d$LAZYDEADLINE*$tw->{obj}{duration}; 203628dif$deadline<$twd{ 2049260splice@workunits,$i,0,$obj->{id}; 205f080last; 206160c} 2075717} 20889b4NotifyMonitors'CREATE','WORKUNIT',$obj->{id}; 2098708#MonitordoesnotneedUPDATEWORKERmessagealso. 210c050} 2113883else{print$sock"ERRInvalidcommandn"} 212d384} 213ff77elsif$clients{$id}{kind}==1{#Worker 214926bmy$w=$workers{$clients{$id}{worker}}; 2159b76#Trackourlast-seen-timefortheworker 2164c3e$w->{seen}=time; 2174867if$l=~m/^WORKACCEPTED/{} 218fe11elsif$l=~m/^WORKFINISHED|FAILED.+$/{ 219cea3my$state,$json=$1,$2; 220ada5my$obj=eval{from_json$json}||undef; 221f83emy$error=''; 222a68cifref$objne'HASH'{$error='InvalidJSON'} 2238677elsif!$obj->{id}or!exists$workunits{$obj->{id}} 22438a6{$error='Workunitdoesnotexist'} 22557f8elsif$workunits{$obj->{id}}{worker}ne$clients{$id}{worker} 226deca{$error='Workunitnotassignedtoyou'} 2271d95if$error{ 2286910print$sock"ERR$errornQUERYWORKn"; 2295717} 23088deelse{ 231826aDeleteWorkunit$obj->{id},$state,$json; 2321ed0#Updatelast-successfully-completed-workunittime 233a5b7$w->{seenwork}=timeif$stateeq'FINISHED'; 2345717} 235c050} 2362d86elsif$l=~m/^WORKING.+$/{ 237c8b3my$json=$1; 238ada5my$obj=eval{from_json$json}||undef; 23924b2$obj=[]ifref$objne'ARRAY'; 2401832my%found=; 241ce67foreach@$obj{ 242c258nextunlessexists$workunits{$_}; 2437e2enextunless$workunits{$_}{worker}eq$clients{$id}{worker}; 244de99#Thisisaworkunitwehaveassignedtothisworker. 245e489$found{$_}=1; 2465717} 247cccd#Istheworkeroperatingonanunexpectednumberof 2484481#workunits? 2498a21my$n=scalar@$obj; 250bfa0if$w->{assigned}!=$n{ 25133acprint"Workertaskcountupdate$w->{assigned}->$nn"; 252adc9$busythreads+=$n-$w->{assigned}; 253cbb0$busythreads=0if$busythreads<0; 2546aabif$busythreads>=$totalthreads 255a8c7{$freeworkersince=0} 256ea85elsif!$freeworkersince{$freeworkersince=time} 25789a3NewWorkerif$n<$w->{assigned}; 25851f3$w->{assigned}=$n; 259d2b1NotifyMonitors'UPDATE','WORKER',$w->{id}; 2605717} 2615380#Cancelalltheworkunitstheclient"forgot"about. 262f4adforeachkeys%workunits{ 2637e2enextunless$workunits{$_}{worker}eq$clients{$id}{worker}; 2640deenextifexists$found{$_}; 2653ea1my$json=to_json 2668c4b{id=>$_,error=>'Clientforgotaboutme'}; 2674671DeleteWorkunit$_,'FAILED',$json; 2685717} 269c050} 270938aelsif$l=~m/^THREADSd+$/{ 271f22dif!$w->{jailed}{ 272f8e9#Donotallowjailedclientstochangenumberof 2732541#threads.Thiswillkeepusfromsendingthemwork. 274cd4cmy$t=$1; 275ae2e$totalthreads+=$t-$w->{threads}; 276611d$totalthreads=0if$totalthreads<0; 27726ceif$t>$w->{threads}{ 278b967$freeworkersince=timeunless$freeworkersince; 27992b2NewWorker; 280160c} 281b8daelse{#Reducingnumberofthreads. 28292e4$freeworkersince=0 283f16aif$busythreads>=$totalthreads; 284160c} 2855c71$w->{threads}=$t; 286d2b1NotifyMonitors'UPDATE','WORKER',$w->{id}; 2875717} 288c050} 28942fbelsif$l=~m/^PING/{print$sock"PONGn"} 290d954elsif$l=~m/^PONG/{} 2913883else{print$sock"ERRInvalidcommandn"} 292d384} 108 PAGE 119 29353b3else{ 2941bd3#Processmessagefromclient 2958ed8if$l=~m/^HELLOS+S+S+$/i{ 296d3e4my$sockver,$name,$ident=$1,$2,$3; 297b644$name=~tr/-_.0-9a-zA-Z//cd; 298b446#HellomessagesshouldhaveVersionandID/Permissions. 2999078if$identeq$JOBSENDERTOKEN{ 3005d44#FIXME:Checknameforcollisions 301790dprint$sock"OKYoucannowsubmitworkn"; 3028634$clients{$id}{kind}=2; 3036d07$clients{$id}{idle}=1; 3048880$totaldistributors++; 305d140$idledistributors++; 3065717} 307b1a9elsif$sockvereq'0'{ 3087de2print$sock"ERRInvalidkeyn"; 3095717} 31088deelse{ 3117912ifexists$workers{$ident}{ 31237a7DeleteWorker$ident; 313160c} 3144e5a$clients{$id}{kind}=1; 3159218$clients{$id}{worker}=$ident; 3162dd6$workers{$ident}= 317634d{sock=>$id,id=>$ident,name=>$name, 3188b4dthreads=>0,assigned=>0,seen=>time, 3199702seenwork=>0,jailed=>0,ver=>$sockver}; 32027d9if$sockver<$MINCLIENT{ 321d3faprint$sock"GOAWAY$goaway_messagen"; 322c332print"Client$namehasversion$sockver;tooold!n"; 323b5ef$workers{$ident}{jailed}=1; 324160c} 32539b2else{ 3269bee#Workerclient. 327b1c4print$sock"GRACEFACTOR$AGRESSIVEDEADLINEn"; 3288330#Istheclientworkingonanyjobsrightnow? 329cde7print$sock"QUERYWORKn"; 3307d27#ClientwillrespondtocommandsbeforegettingOK 331b991#andthiswaywillavoidracewherewesend 332cf3a#QUERYWORKandthenaworkunitbeforereceiving 333813e#emptyQUERYWORKresponse. 334a380print$sock"OKIwillnowsendyouworkn"; 33592b2NewWorker; 336160c} 337c89dNotifyMonitors'CREATE','WORKER',$ident; 3385717} 339c050} 340e177elsif$l=~m/^MONITOR/{ 34115a5push@monitors,$sock; 34273d1$clients{$id}{kind}=3; 343dac5foreachvalues%workers{ 3442d4anextunlessexists$_->{name}anddefined$_->{name}; 3453fa1nextunlessexists$_->{sock}anddefined$_->{sock}; 3461b08print$sock"UPDATEWORKER$_->{id}",to_json$_,"n"; 3475717} 3486931foreachvalues%workunits{ 3491194nextunlessexists$_->{obj}anddefined$_->{obj}; 3500625nextunlessexists$_->{worker}anddefined$_->{worker}; 351c4adprint$sock"UPDATEWORKUNIT$_->{obj}{id}", 352d3beto_jsonExportWorkunit$_->{obj}{id},"n"; 3535717} 354c050} 355db3delse{ 3567977print$sock"ERRLoginfirstn"; 357c050} 358d384} 3592a9a} 360e266} 36137ca#Probeworkunitsincasetheyaregone 3622be9my$hrnow=Time::HiRes::time; 3632528my$now=time; 36416d9#Iftherearefreeworkersandwe'vetoldeverybody,butnonehas 36539c9#providedwork,trytomakeworkbyagressivelycancelingoverdue 36623bb#workunits.FIXME:Allowdistributortospeedthisupbyreporting 36738b7#"noworkavailable." 3681e66my$agressive=$freeworkersince&&$now>$lastagressive+$DISTTIMEOUT&& 369cd05$totaldistributors<=$idledistributors|| 370edd6$now>$freeworkersince+$DISTTIMEOUT*2; 3718a14my$deadlinefactor=$agressive?$AGRESSIVEDEADLINE:$LAZYDEADLINE; 3720626my$i=0; 373e5f8whilemy$k=$workunits[$i]{ 3747065my$deadline=$workunits{$k}{obj}{duration}*$deadlinefactor; 3757ca5if$workunits{$k}{starttime}+$deadline>$hrnow 376b18a{$i++,nextif$agressive;last} 377b681#Workunitisrunningtooslowly.Canceltheworkunit. 37802b2print"Failing$k,",$hrnow-$workunits{$k}{starttime}379de53$workunits{$k}{obj}{duration},"secondslate[$agressive].n"; 380ca92ifexists$workers{$workunits{$k}{worker}}{ 381b5e1DeleteWorker$workunits{$k}{worker}; 3822a9a} 383d64cmy$err={id=>$k,error=>'Timedoutwaitingforworker'}; 384c060DeleteWorkunit$k,'FAILED',to_json$err; 385adb8#FIXME:SendABORTWORKmessagetoworker 3865e04#Tomaximizeuseofworkers,onlyagressivelycanceloneworkunit 387b286#atonce,thiswillallowtheunusedcapacitytobetakenupbyredos 3889867#ofoneworkunitatatime. 389dfeeif$agressive{$lastagressive=$now;last} 390e266} 391d9d4#Dumpworkerlisteveryminute 392f3c6DumpWorkersif$now>=$lastworkerdump+$DUMPINTERVAL; 3932126#Ifthere'sbeenanidleworkerforawhileandwe'veonlytoldone 3940f82#dispatcher,telltherestincasetheyhavework. 395325bNewWorkerif$freeworkersinceand$freeworkersince!=$lastbroadcastand 3969385$now>=$freeworkersince+$DISTTIMEOUT; 397c2b7} 398da39 3993b31subDeleteWorkunit{ 4000b3fmy$id,$status,$json=@_; 401dd93my$dispatcher=$workunits{$id}{source}; 40290a3my$dispatchsock=exists$clients{$dispatcher}? 403848e$clients{$dispatcher}{sock}:undef; 4043a4fif$dispatchsock{ 405d48cprint$dispatchsock"WORK$status$jsonn"; 406e266} 4071970my$wid=$workunits{$id}{worker}; 408ad52ifexists$workers{$wid}{ 409b2b2my$w=$workers{$wid}; 410c9c5$w->{assigned}--unless$w->{assigned}<=0; 41176ef$busythreads--unless$busythreads<=0; 412d641#Ifthisdispatcherdoesn'thavemorework,thisworkerwillremain 413e54e#idleuntilwebroadcastitsexistancetoallotherdispatchers. 414ba4c$freeworkersince=time 4156e2aif!$freeworkersince&&$busythreads<$totalthreads; 4164f5a#Thisdispatcherdisappeared,alertallotherdispatchersimmediately. 417d294NewWorkerif!$dispatchsock; 4185c73#NotifyMonitors'UPDATE','WORKER',$wid;#Clientcanfigureout 419e266} 42001f9#Removeworkunitfromlists 421ed7dformy$i=0;$i<@workunits;$i++{ 422a708nextunless$workunits[$i]eq$id; 423bd7csplice@workunits,$i,1; 424b72alast; 425e266} 4269454NotifyMonitors'DELETE','WORKUNIT',$id,$status; 427c766delete$workunits{$id}; 428c2b7} 429da39 430d9a8subDeleteWorker{ 431dc50my$id=@_; 4324885print"Deletingworker$idn"; 433ae97NotifyMonitors'DELETE','WORKER',$id; 43496a3returnunlessexists$workers{$id}; 4356526my$sockid=$workers{$id}{sock}; 436499aif$clients{$sockid}{sock}{ 437bb96$select->remove$clients{$sockid}{sock}; 438e2ffclose$clients{$sockid}{sock}; 439e266} 440f243$totalthreads-=$workers{$id}{threads}; 441d4d7$totalthreads=0if$totalthreads<0; 442ad50$busythreads-=$workers{$id}{assigned}; 4431f0e$busythreads=0if$busythreads<0; 444a91d$freeworkersince=0if$busythreads>=$totalthreads; 445689edelete$clients{$sockid}; 446c65fdelete$workers{$id}; 109 PAGE 120 447c2b7} 448da39 4498c70#Broadcasttheexistanceofafreeworker 45072d1subNewWorker{ 4515e8ereturnif!$freeworkersince||$lastbroadcast==$freeworkersince; 452ecadforeachvalues%clients{ 4536723warn"Corruptclientlist" 4543d9aunlessref$_eq'HASH'anddefined$_->{kind}; 4552199my$c=$_->{sock}; 456dd29nextunless$_->{kind}==2&&!$_->{idle}; 457b8eaprint$c"NEWWORKERn" 458e266} 4592094$lastbroadcast=$freeworkersince; 460c2b7} 461da39 462c110#Saveadumpofallconnectedworkersformonitoring 463ee5dsubDumpWorkers{ 4644d56unlessopenWF,'>','workers.dat'{ 465b6cfwarn"Can'twriteworkerdump:$!"; 46640a3return; 467e266} 468d76cunlessopenWJ,'>','workers.json'{ 4694bb6closeWF; 4708443warn"Can'twriteworkerJSONdump:$!"; 47140a3return; 472e266} 473855eprintWJ'{'; 4746c2emy$comma=0; 4752528my$now=time; 476dc6fforeachmy$idkeys%workers{ 477a677nextunlessexists$workers{$id}{name}anddefined$workers{$id}{name}; 478118dnextunlessexists$workers{$id}{sock}anddefined$workers{$id}{sock}; 479582fprintWF"$idt$workers{$id}{name}t$workers{$id}{seen}t", 480524e"$workers{$id}{seenwork}t$workers{$id}{threads}t", 481134e"$workers{$id}{ver}n"; 4821ffaif$comma{printWJ','}else{$comma=1} 483b323printWJ"n"$id":",to_json$workers{$id}; 4848862if$workers{$id}{seen}+300<$now{ 4859806my$sock=$clients{$workers{$id}{sock}}{sock}; 486aa5cprint$sock"PINGn"; 4872a9a} 488e266} 489f9d7printWJ"n}n"; 490704fcloseWF; 4913f24closeWJ; 492a75b$lastworkerdump=$now; 493c2b7} 494da39 495e105subNotifyMonitors{ 4967d06returnunless@monitors; 497674dmy$message,$itemtype,$id,$arg=@_; 498fccfmy$prestr="$message$itemtype$id"; 499578cmy$str=''; 50051ae#print$prestr,"n"; 5013422if$messagene'DELETE'{ 50247a8if$itemtypeeq'WORKER'{$str.=''.to_json$workers{$id}} 503fe7celsif$itemtypeeq'WORKUNIT' 5047c67{$str.=''.to_jsonExportWorkunit$id} 505e266} 506830felsif$arg{$str=$arg}; 50729c9$str=''.$strif$str; 508c047foreach@monitors{ 509a54dprint$_$prestr,$str,"n"; 510e266} 511c2b7} 5128d47subExportWorkunit{ 513dc50my$id=@_; 514cfb2return{'worker'=>$workunits{$id}{worker}, 51527bd'starttime'=>$workunits{$id}{starttime}, 5163879'duration'=>$workunits{$id}{obj}{duration}, 517828b'nitems'=>$workunits{$id}{obj}{nitems}}; 518c2b7} File8: distclient.pl 104c4#!/usr/bin/perl 2d08f# 38246#distclient.pl-DistributedComputingClient 4d08f# 5da39 6c4a8##SupportunthreadedPerlviatheforksmodule.WARNING:Veryslow. 72251#useConfig; 87aff#useif!$Config{usethreads}forks; 9e9bfusethreads; 10310ausethreads::shared; 11426auseFindBin; 126de0BEGIN{chdir$FindBin::Bin;push@INC,"$FindBin::Bin/lib"} 133478useSys::Hostname; 147c8euseDigest::MD5; 15e339useJSON; 16851euseIO::Socket::INET; 177e7duseIO::Select; 1840aeuseHTTP::Request; 19de75useHTTP::Async; 2087acuseFile::Temp; 21ce53useFile::Copy; 22f00buseFile::Spec; 23fa32useURI; 242020useArchive::Tar; 25dfe6useTime::HiRes; 26e6ecuseif$^Oeq'MSWin32',Win32::API; 279d9busewarnings; 288428usestrict; 29da39 302accour$VERSION=20120516; 3151femy$USERAGENT="distclient.pl/$VERSION"; 32ccafmy$IPC_PORT=29482; 33da39 3497e9#Loadserverconfiguration 358bc1our$NAME='DistributedComputingClient'; 36bc78our$HOST,$PORT; 374dd8our$SERVERNAME,$SERVERDETAIL; 38e51emy$serverconffn=File::Spec->catfile$FindBin::Bin,'server.conf'; 392e02ifopenF,'<',$serverconffn{ 404e8cmy$buf= PAGE 121 74da39 7539bbmy$dwNumberOfProcessors=unpack'L9',$systemInfo[5]; 76dbf5$processorcount=$dwNumberOfProcessorsif$dwNumberOfProcessors||0>0; 77da39 783112#Determineprocessorarchitecture 799f58my$wProcessorArchitecture=unpack'L9',$systemInfo[0]>>16; 80ef84if$wProcessorArchitecture==9{$platform='win64'}#9=AMD64 815f4felse{$platform='win32'}#0=X86,6=Itanium 82da39 834cad#Getvolumeserialnumberforsystemdirectory,forsystemidentification. 840851#Whileclonable,thisisgoodenoughforcurrentpurposes. 85ceb9#UINTGetSystemDirectoryLPTSTRlpBuffer,UINTuSize; 860558$function=Win32::API->new'kernel32','GetSystemDirectory','PN','N' 871f7eordie"Can'timportGetSystemDirectory:$!"; 88b21dmy$windir=''x1024; 89160dmy$rc=$function->Call$windir,1023; 9017a4die"GetSystemDirectoryfailed:$rc"if$rc<1||$rc>1022; 916fe5if$windir=~/^[A-Za-z]:[/]/{$windir=uc"$1:"} 92d010else{die"GetSystemDirectoryerror,rc=$rcn"} 93da39 94d13c#BOOLGetVolumeInformationLPCTSTRlpRootPathName, 953416#LPTSTRlpVolumeNameBuffer,DWORDnVolumeNameSize, 96ee0b#LPDWORDlpVolumeSerialNumber,LPDWORDlpMaximumComponentLength, 970581#LPDWORDlpFileSystemFlags,LPTSTRlpFileSystemNameBuffer, 9877c1#DWORDnFileSystemNameSize; 9930fe$function=Win32::API->new'kernel32','GetVolumeInformation', 1001e4d'PPNPPPPN','N' 101a841ordie"Can'timportGetVolumeInformation:$!"; 102d833my$volumeSerialNumber=pack'I',0; 103ada2$function->Call$windir,0,0,$volumeSerialNumber,0,0,0,0 1046522ordie"Can'tretrievevolumeinformationn"; 105da35$volumeSerialNumber=unpack'I',$volumeSerialNumber; 1063e72$sysident=sprintf"%s%08x",$windir,$volumeSerialNumber; 107c2b7} 10839e0#OnLinux,wecanuse/proc/cpuinfotogetthenumberofprocessors. 109d858elsif-r'/proc/cpuinfo'{ 110a420#OnLinux,useprocfstofindouthowmanyCPUswehave 111179aopenF,'<','/proc/cpuinfo'ordie"Cannotopen/proc/cpuinfo"; 1125b84while PAGE 122 228e5b1my$downloading:shared=0;#Mutexfordownloadphase 229550bmy$appfiles:shared=0;#Mutexforapplicationfiles 230b28fmy$appfn:shared='';#Currentapplicationfilename 231b122my$appversion:shared=0;#App"version"threadsreloadonchange 232da39 233f6f7our$clientthread=undef;#Socketcommunicationthread 2349becour@workthreads;#Workerthreads 235c5edour%callbacks=#Callbackfunctions 236b4a1poststatus=>undef,#Poststatusmessagetomainthread 237fb86; 2384838my$startup_complete:shared=0; 23919femy$exiting:shared=0; 240da39 24127f6if!caller{ 242c839print"PID:$$n"; 2433fd3$callbacks{poststatus}=sub{ 244b495my$result=@_; 2450f1bprintto_json$result,"n"; 246fdcb}; 247f2b4StartClient; 248259asleep100while1; 2493eefOnExit; 2500057exit0; 251c2b7} 252da39 2530d04#Starttheclientworkerthreadsandmainsocketthread 254a0e4subStartClient{ 2554f6bdie"Statuscallbackisnotdefined."unless$callbacks{poststatus}; 2562419formy$i=1;$i<=$THREADCOUNT;$i++{ 2573d55push@workthreads,threads->create&WorkThread,$i; 258e266} 259299d$startup_complete=0; 26074ca$clientthread=threads->create&SocketThread; 261a5e5{#Waitforstartup 262911dlock$startup_complete; 2637d88cond_wait$startup_completeuntil$startup_complete!=0; 264e266} 26502a3if$startup_complete<0{ 266ef74PostRawStatus{thread=>0,mode=>'ERROR', 267c793error=>"Failedtostartclientthread."}; 2683d57$exiting=1; 269e266} 270c2b7} 271da39 2723f98#Calledbyathreadtosendastatusmessagetothefront-end 273e21bsubPostStatus{ 2740562my$id:shared=shift@_||0; 2751ab7unlessexists$status{$id}{ 276db54$status{$id}=shared_clone{id=>$id}; 277e266} 278c808whiledefinedmy$k=shift@_&&definedmy$v=shift@_{ 27956d7$status{$id}{$k}=$v 280e266} 2811a1c#Monitorspeed 2824de5if$id&&exists$status{$id}{mode}&& 283843b$status{$id}{mode}||''eq'WORKING'{ 28471a9my$progress=$status{$id}{progress}; 28540f5my$range=$status{$id}{range}; 2863fe9ifdefined$range&&defined$progress&&$progress<$range{ 287e818my$dl=$deadline{$id}; 288a089my$now=Time::HiRes::time; 289f29bif$progress&&$dl->{lasttime}&&!$dl->{aborted}{ 2905120#Currentinstantaneousduration/individual 2915319my$curdur=$now-$dl->{lasttime}; 2920374#Expectedduration/individualfromserver 29308f2my$dur=$work{$id}{duration}/$range; 294cb90#Usethemostoptimisticestimate 2957d0e$dur=$curdurif$curdur<$dur; 296757emy$endtime=$now+$range-$progress*$dur; 297a7f3#Willwemeetourdeadline? 2986b3aif$endtime>$deadline{$id}{deadline}{ 299ef8d#Wewon'tbemeetingourdeadline. 300bc09print"Aborting$id,won'tmeetdeadline.n"; 301fae9$dl->{aborted}=1; 302ecd4{lock@abortedwork;push@abortedwork,$id} 303c050} 304d384} 30533a4$dl->{lasttime}=$nowunless$dl->{aborted}; 3062a9a} 307e266} 308f514$callbacks{poststatus}->$status{$id}; 309c2b7} 310da39 31178b0subPostRawStatus{ 3125941my$status=@_; 313be65$callbacks{poststatus}->shared_clone$status; 314c2b7} 315da39 3168fc7#Calledbyfront-endtoconvertastatusmessageintoa 3173db7#textualrepresentation 318e72csubRenderStatus{ 3195941my$status=@_; 3208c4cmy$mode=$status->{mode}||''; 32172b0my$str='Idle'; 322e6baif$modeeq'STARTING'{$str='Startingup...'} 3237303elsif$modeeq'CONNECTING'{$str="Connecting..."} 3249dbcelsif$modeeq'DISCONNECTED'{ 3252560$str=$status->{error}?"Cannotconnect:$status->{error}": 3263c51'Disconnected'; 32773de$str.="$status->{progress}sec"ifdefined$status->{progress}; 328e266} 3292290elsif$modeeq'WAITING' 33023a8{$str=$status->{id}?'Idle':'Connected'} 331c30a#elsif$modeeq'STOPPING'{$self->Close;return} 332b464elsif$modeeq'DOWNLOADING'or$modeeq'UPLOADING'{ 333e1d1$str=sprintf'%sfile%dof%d...', 334b411$modeeq'UPLOADING'?'Uploading':'Downloading', 3350b1d$status->{progress}+1,$status->{range}; 336e266} 33797afelsif$modeeq'WORKING' 338f9b8{$str='Computing'.$status->{id}||'untitled'.'...'} 33975b4elsif$modeeq'ERROR' 3401e89{$str='Error:'.$status->{error}||'unknown'} 3417472elsif$modeeq'FINISHED'{$str='Completed'} 342bdd6return$str; 343c2b7} 344da39 34563f2#Calledbyfront-endtotriggerashutdowntheclient 346e174subDoExit{ 347080fmy@threads=$clientthread,@workthreads; 348ee7b$_->kill'SIGTERM'foreach@threads; 349f71a$exiting=1; 350c2b7} 351da39 3527ff3#Calledbyfront-endwhentheprogramisabouttoexit 353209asubOnExit{ 35401e5return0if$exiting>=2; 355a2beDoExitif$exiting<1; 356080fmy@threads=$clientthread,@workthreads; 3573d2dmy@joined=; 3588ebfmy$i=0;my$wait=1; 3591719$exiting=2; 360f67dwhile$i<20&&$wait{ 36102b9$wait=0; 362d82aformy$j=0;$j<@threads;{ 36304c3$j++,nextif$joined[$j]; 36430d9if$threads[$j]->is_joinable{ 365b257print"JOINn"; 366e7b6$threads[$j]->join; 367684c$joined[$j]++; 368d384} 36953b3else{ 37001c0#Nudgeanythreadsstuckin"Idle"mode. 3714e19{lock@pendingwork;cond_broadcast@pendingwork} 372b209threads->yield; 3734b2e$wait++; 374fc06$j++; 375d384} 3762a9a} 3775e2dif$wait{ 378ddd0print"Waiting...n"; 379536dselectundef,undef,undef,.25; 3802a9a} 3819377else{print"Nowaitn"} 112 PAGE 123 382b285$i++; 383e266} 384bad9chdir$FindBin::Bin;#Toallowtempdircleanuptotakeplace 38579b3return1; 386c2b7} 387da39 388df7dsubhandle_ipc{ 3895b65my$select,$ipc=shift@_,shift@_; 39062a6foreachmy$s@_{ 391980bif$seq$ipc{#IPCserver 392a407print"GotIPCclientn"; 393b97bmy$ipcclient=$s->accept; 3948f6e$select->add$ipcclient; 3953690next; 3962a9a} 3979e11else{#IPCclient 3987b46my$buf=''; 3990c5dmy$rc=sysread$s,$buf,1; 400b101if!$rcor$rc<1{$select->remove$s;close$s;next} 401c4fc$buf=uc$buf; 40278e7my%rpccommands=Q=>'STOPPING',S=>'SHOWWINDOW', 403da7dH=>'HIDEWINDOW'; 404c6a8ifexists$rpccommands{$buf}{ 4057804PostRawStatus{thread=>0,mode=>$rpccommands{$buf}}; 406d384} 4073690next; 4082a9a} 409e266} 410c2b7} 411da39 412a2b0#Threadtohandlesocketcommunicationwiththeserver 41381besubSocketThread{ 4145b5dmy$sock; 4157459local$SIG{TERM}=sub{ 416fb62PostStatusundef,mode=>'STOPPING'; 417d358foreachmy$idkeys%work{ 4188ff9my$json=to_json{id=>$id,error=>'Userabort'}; 4196140print$sock"WORKFAILED$jsonn"if$sock; 4202a9a} 42140b6print"Threadexitingn"; 422d1d4#Cleanup 423e449if$sock{ 4241395close$sock; 4252a9a} 426d573threads->exit; 427fdcb}; 428b55elocal$SIG{HUP}=sub{}; 429fdc6local$SIG{PIPE}='IGNORE'; 430b89blocal$SIG{INT}='IGNORE'; 431da39 4328a64#CreateIPCMessageSocket 433bc2emy$ipc=IO::Socket::INET->newListen=>1,Reuse=>1, 4340107LocalAddr=>'127.0.0.1', 435a98fLocalPort=>$IPC_PORT; 436dde5warn"Failedtocreatemessagesocket:$!"unless$ipc; 4373353$startup_complete=$ipc?1:-1; 438a639{lock$startup_complete;cond_broadcast$startup_complete} 439d8dfsleep5,dieif$startup_complete<0; 440da39 4419f14my$wait=3; 44266c2my$select=IO::Select->new$ipc; 4438316while1{ 444e0b6handle_ipc$select,$ipc,$select->can_read; 445ce86PostStatusundef,mode=>'CONNECTING'; 44635c0$sock=IO::Socket::INET->newPeerAddr=>$HOST,PeerPort=>$PORT; 4471151if!$sock{ 448e638my$error=$!; 449ad5cformy$i=$wait;$i>0;$i--{ 450cb85handle_ipc$select,$ipc,$select->can_read; 4517300PostStatusundef,mode=>'DISCONNECTED',error=>$error, 4520202progress=>$i; 453390esleep1; 454d384} 4558ce7$wait=int$wait*1.4;#Backoffbeforeretrying? 4563690next; 4572a9a} 458827d#Connectionwassuccessful 4596444$select->add$sock; 4601b0c$wait=3;#Resettimeout 461b2e9$gracefactor=1.5; 462fa42{my$oldfh=select$sock;$|=1;select$oldfh} 4637feaPostStatusundef,mode=>'WAITING'; 46476e4my$sockbuf=''; 465abf6my$lastseen=0; 4667d6bSOCKLOOP: 4673389while1{ 468eef3foreachmy$s$select->can_read.25{ 469e044if$sne$sock{handle_ipc$select,$ipc,$s;next} 470dad2my$bytes=sysread$sock,$sockbuf,512,length$sockbuf; 4711f81if!$bytes{ 472bf05PostStatusundef,mode=>'DISCONNECTED', 473a13cerror=>defined$bytes?'Retrying':$!; 47480a5lastSOCKLOOP; 475c050} 476d5d1$lastseen=0; 477fa35while$sockbuf=~s/^.*?r*n//{ 4783216my$l=$1; 4792cf1#print"$ln"; 4804ff3if$l=~m/^HELLO/{ 4810f92print$sock"HELLO$VERSION$hostname$myidentn"; 4822297print$sock"PLATFORM$platformn"; 4835717} 48439feelsif$l=~m/^OK/{ 4856ad7print$sock"THREADS$THREADCOUNTn"; 4865717} 487bfabelsif$l=~m/^WORK.+/{ 488c6fcmy$json=$1; 4895dcemy$obj=from_json$json;#FIXME:Crashproof 490caea#print$sock"WORKREJECTEDn"; 4911019$deadline{$obj->{id}}=shared_clone{}; 492a65d$deadline{$obj->{id}}{deadline}=Time::HiRes::time+ 493077d$obj->{duration}*$gracefactor; 49421ee$deadline{$obj->{id}}{aborted}=0; 4952a08$work{$obj->{id}}=shared_clone$obj; 496726f{ 4975afdlock@pendingwork; 498f14fpush@pendingwork,$obj->{id}; 4991bf4cond_signal@pendingwork; 500160c} 501c78e#PostStatus$obj->{id},mode=>'STARTING'; 502bfbcprint$sock"WORKACCEPTED$obj->{id}n"; 5033ea9DoInhibit; 5045717} 505ba4celsif$l=~m/^ABORTWORKs+S+/{ 506bcd5my$id=$1; 5079e2fifexists$work{$id}&&exists$status{$id}&& 508d10fexists$status{$id}{thread}&& 5097848$status{$id}{thread}{ 51098b8push@abortedwork,$id; 511b311} 5125717} 5137fe5elsif$l=~m/^QUERYWORK/{ 514f340my@k=keys%work; 515ed22my$json=to_json@k; 5164304print$sock"WORKING$jsonn"; 5175717} 5183f17elsif$l=~m/^PING/{ 519bed1print$sock"PONGn"; 5205717} 521c09felsif$l=~m/^PONG/{} 522e85felsif$l=~m/^GOAWAY|QUITs*.*/{ 5233b63PostStatusundef,mode=>'ERROR',error=>$2|| 5242275'Rejectedbyserver,noreasongiven.'; 525a41fPostRawStatus{thread=>0,mode=>'STOPPING'} 526ecb7ifuc$1eq'QUIT'; 5275717} 528de74elsif$l=~m/^GRACEFACTORs+[0-9.]+/{ 5294853$gracefactor=$1+0; 5305717} 531fad0#print"$ln";#FIXME 532c050} 533d384} 5347841#Abortworkunitsiftheywerecanceledbytheserverorifwe 5350a41#won'tbeabletocompletethembeforetheirdeadline. 113 PAGE 124 536f617while1{ 5379be8my$id=undef; 53845a2{lock@abortedwork;$id=shift@abortedwork} 539bba9lastunless$id; 540e8d2ifexists$status{$id}&&exists$status{$id}{thread}{ 541614cmy$thread=$status{$id}{thread}||0-1; 542c073if$thread>=0&&$thread<@workthreads{ 5433786$workthreads[$thread]->kill'SIGHUP'; 5445717} 545c050} 5461e13$deadline{$id}{aborted}=1ifexists$deadline{$id}; 547957bmy$json=to_json{id=>$id, 5487aa4error=>'Abortedbyserverorwillnotmeetdeadline.'}; 549e3f6print$sock"WORKFAILED$jsonn"; 550d384} 551a1ee#Sendresultsforcompletedworkunits. 552f617while1{ 553c7dcmy$w=undef; 554a250{lock@finishedwork;$w=shift@finishedwork} 555835clastunless$w; 5565010my$cmd='WORKFINISHED'; 55798c8$cmd='WORKFAILED'ifexists$w->{error}&&$w->{error}; 5584374delete$status{$w->{id}}; 559f4d9delete$work{$w->{id}}; 5603a81delete$deadline{$w->{id}}; 56122eeprint$sock"$cmd",to_json$w,"n"; 5627408#PostStatusundef; 563d384} 5641f55#Checkpingtime 5650258$lastseen++; 5667beaif$lastseen==2400{print$sock"PINGn"} 567a2d7elsif$lastseen>=2500{ 5685a4cPostStatusundef,mode=>'DISCONNECTED', 5699f1ferror=>'Timedoutwaitingforapingreply.'; 570a3d1lastSOCKLOOP; 571d384} 5721367#FIXME:Trapunexpectederrorsfromthreads 5732a9a} 574bda4#Disconnected. 575d31f$select->remove$sock; 5761557close$sock; 577556c$sock=undef; 578da0b#Trytoreconnect. 5796e0fsleep1; 580e266} 581c2b7} 582da39 583a52bsubWorkThread{ 5844843my$thread=@_; 58530e9my$id; 5867459local$SIG{TERM}=sub{ 587854cprint"Workerexitingn"; 588d1d4#Cleanup 589d573threads->exit; 590fdcb}; 591b55elocal$SIG{HUP}=sub{}; 592fdc6local$SIG{PIPE}='IGNORE'; 593b89blocal$SIG{INT}='IGNORE'; 59479d5my%app=; 5958316while1{ 596f063{#Waitforwork 5976986lock@pendingwork; 598b97bcond_timedwait@pendingwork,time+2 599a089until@pendingworkor$exiting; 600d74ethreads->exitif$exiting; 6011808$id=shift@pendingwork; 6022a9a} 603db37my$obj=$work{$id}; 60492f7eval{ 6055977PostStatus$id,mode=>'STARTING',thread=>$thread; 6062fca#Checkfiles 6074139DownloadFiles$id; 6080386#Getread-lockonapplicationfiles 609551a{ 610e3bflock$appfiles; 611721dcond_wait$appfilesuntil$appfiles>=0; 6124e6a$appfiles++ 613d384} 614ed60my@outfiles; 6157e6fmy$rc=eval{ 61604e7LocalLoadApplication$id,%app; 61744acmy$workfunc=&{$app{work}}; 618b7cf@outfiles=$workfunc->$id,$work{$id}; 619c79d1; 620a9c2}; 6217630{lock$appfiles;$appfiles--;cond_broadcast$appfiles} 6221a1cunless$rc{ 6230c4fdieif$@=~m/WORKFAIL/;#We'vealreadyreportedfailure 624e771WorkFail$id,"Startingwork:$@"; 625d384} 626fe5d#Uploadresults 62754c1PostStatus$id,mode=>'UPLOADING',progress=>0, 62865b5range=>scalar@outfiles; 6293e19my$reply={id=>$id,files=>[UploadFiles$id,@outfiles]}; 6302584#CleanupFiles$id; 631dbd3#PostStatus$id,mode=>'FINISHED',progress=>1,range=>1; 632b2d5PostStatus$id,mode=>'WAITING',progress=>0,range=>1; 633cde9{lock@finishedwork;push@finishedwork,shared_clone$reply} 634ab8d}; 635eae2#print"Finishedn"; 636e266} 637c2b7} 638da39 63957dbsubWorkFail{ 640d6ecmy$id,$message=@_; 6411379PostStatus$id,mode=>'ERROR',error=>$message; 642962cprint"$messagen"; 64351a1lock@finishedwork; 644ae5apush@finishedwork,shared_clone{id=>$id,error=>$message}; 645b1afdie"WORKFAIL"; 646c2b7} 647da39 648e927submd5{ 649310fmy$fn=@_; 650493aopenF,'<',$fnorreturnundef; 651b9cbbinmodeF; 652b778my$md5=Digest::MD5->new->addfile*F->hexdigest; 653fd05closeF; 6548bccreturn$md5; 655c2b7} 656da39 657b3b6suburlhosthack{ 6580a36my$url=@_; 6593cad#Specialhack:URLswithnullhostnameshttp://:9990/fooor 660a1bf#http:///foowilldefaultthehostnametothatofthedistributed 661e6e2#server,e.g.http://server:9990/fooandhttp://server/foo. 662bb71$url=~s/^w+://[:/]/$1$HOST$2/; 6633b0creturn$url; 664c2b7} 665da39 666f6besubDownloadFiles{ 667dc50my$id=@_; 668aa91my$obj=$work{$id}; 669aad1my$async=HTTP::Async->new; 67076e7my%fetch=;my$nfiles=0; 6713357my$got=0; 6729910my$appfn=''; 67363bdmy$appchanged=0; 674f495lock$downloading;#Preventraces 67531c5foreachmy$f@{$obj->{files}}{ 67607e9my$valid,$url,$fn=@$f; 677d8d6ifIsApplication$fn{ 678ded8WorkFail$id,"Workunitprovidedmultipleapplications"if$appfn; 67986e4$appfn=$fn; 6802a9a} 6817263elsif$fn=~m/^[a-zA-Z0-9]+[/]?[-_.a-zA-Z0-9]+$/and 682ddf9!defined$2or!-e$2or-d$2{ 683ca1dmy$dir,$basename=$2,$3; 68446e4mkdir$dirifdefined$dir&&!-d$dir; 6852a9a} 6860d9delse{returnWorkFail$id,"Invalidfilename$fn"} 687108dmy$checksum=md5$fn; 688e41cnextifdefined$checksumand$checksumeq$valid; 689aab0#Checksumdidnotmatch,sodownloadthefile 114 PAGE 125 690c290$nfiles++; 691ec1cif$url=~m/^data:/i{ 69211f7my$u=URI->new$url; 6935a4a#FIXME:Integratetoavoidcodeduplication 694cc3eopenF,'>',$fnorreturnWorkFail$id,"Cannotsave$fn"; 695a74bbinmodeF; 6969ffbprintF$u->data; 697b521closeForreturnWorkFail$id,"Cannotclosesaved$fn"; 6987e75#Verifychecksum 699307bmy$checksum=md5$fn; 700d2f7if$checksumne$valid{ 701d2e5returnWorkFail$id,"Cannotverifychecksumfor$fn"; 702d384} 7032a9a} 7045834else{ 705bc12my$req=HTTP::Request->newGET=>urlhosthack$url; 7062cf2$req->user_agent$USERAGENT; 707ac13my$reqid=$async->add$req;#FIXMEdata:URI 7083aa0$fetch{$reqid}=[$valid,$fn]; 7092a9a} 710e266} 71148d8#Checkapplication 7126bbdWorkFail$id,"Workunitprovidednoapplication"unless$appfn; 7136015unless$nfiles{ 714ace7CheckApplication$id,$appfn; 71540a3return; 716e266} 717e9c0PostStatus$id,mode=>'DOWNLOADING',progress=>0,range=>$nfiles; 7188015whilemy$response,$respid=$async->wait_for_next_response{ 7198a20#Updateprogressbar 720c4a8$got++; 7211023PostStatus$id,progress=>$got; 722739e#Findourinformationaboutthisrequest 7234c15my$valid,$fn=@{$fetch{$respid}}; 724ea77delete$fetch{$respid}; 72582a3#CheckresponseisOK 72613c9if!$response->is_success{ 727810dmy$code=$response->code; 7282157returnWorkFail$id,"HTTP$codewhiledownloading$fn"; 7292a9a} 730caea#Saveresponse 73163f6openF,'>',$fnorreturnWorkFail$id,"Cannotsave$fn"; 7324ebfbinmodeF; 73370d6printF$response->decoded_content; 73451fccloseForreturnWorkFail$id,"Cannotclosesaved$fn"; 73535aa#Verifychecksum 736108dmy$checksum=md5$fn; 737308aif!defined$checksumor$checksumne$valid{ 73875f1returnWorkFail$id,"Cannotverifychecksumfor$fn"; 7392a9a} 740a299$appchanged=1if$fneq$appfn; 741e266} 742d201CheckApplication$id,$appfn,$appchanged; 743c2b7} 744da39 7453674subCheckApplication{ 746f9dbmy$id,$myappfn,$appchanged=@_; 7474254#Checkifweneedtochangeapps 7485cc4if!$appfn||$appchanged||$appfnne$myappfn{ 74982a4#Getwritelockonapplication 7505639{ 75105b7lock$appfiles; 752904ccond_wait$appfilesuntil$appfiles==0; 7535bcc$appfiles-7542a9a} 755ed61my$rc=eval{ 756e126##Theoldappshouldn'tbeabletopreventaswitch,soignore 7578e71##anyerrorsitmayproduce. 75827e1#eval{my$func=&{$app{stop}};$func->}if$app{stop}; 759bcf9#Unpackthenewapp 760f212UnpackApplication$myappfn; 7613db6$appfn=$myappfn; 7629c37##Startthenewapp 763c9a8#if$app{start}{ 76417e7#my$func=&{$app{start}}; 765a75f#$func->; 766c62c#} 76750ca1; 768ab8d}; 7692bce$appversion++; 7707586$appversion=-1unless$rc; 771955e#Releaselockonapplication 772fdbe{lock$appfiles;$appfiles++;cond_broadcast$appfiles} 7738cafWorkFail$id,"Applicationchangefailed:$@"unless$rc; 774e266} 775c2b7} 776da39 7770bb8subLocalLoadApplication{ 7786c62my$id,$app=@_; 7794afdWorkFail$id,"Noapploaded"if$appversion==-1; 780c1e8return 78175fdifdefined$app->{appversion}and$appversion==$app->{appversion}; 782e123print"Loadingappversion$appversionn"; 7839689my$appobj=do'app.pl' 7846ee6orWorkFail$id,"Applicationcodecouldnotbeloaded:$@$!"; 785fb49#Removeolditemsfromapphash 786d810UnsetApplication; 7877cd8#Insertnewitemsintoapphash 7886551foreachkeys%$appobj{$app->{$_}=$appobj->{$_}} 7893a6d$app->{appversion}=$appversion; 790541a#Sanitycheck 7913cfcdie"Applicationcodeprovidesnoworkfunction"unless$app->{work} 792c2b7} 793da39 794b45bsubIsApplication{return$_[0]=~m/^[-_a-zA-Z0-9.]+.app$/} 795da39 7969728subUnsetApplication{ 7977381my$x=@_; 798d161my@x=keys%$x; 799d945for@x{delete$x->{$_}} 800c2b7} 801da39 8021068subUnpackApplication{ 803310fmy$fn=@_; 8048a8a#Unpackthearchive 805e4camy$tar=Archive::Tar->new; 806bff8$tar->read$fnordie"Can'topen$fn:$!"; 807c2e6foreachmy$f$tar->list_files{ 8085f02my@components=split/[/]+/,$f; 809e184shift@componentswhile@componentsand$components[0]eq'.'or 810bb87$components[0]eq''; 811652enextunless@components; 8121a2emy$filearch=$components[0]; 813a238$components[0]='.'; 8149eaa#Checkiffileisincorrectdirectory 8150aaanextunless$filearcheq'all'or$filearcheq'any'or 816cb18$filearcheq$platform; 817f3e6#Verifydirectorycomponentstoensuretherearenoforbidden 818acca#charactersorwords 819501fforeach@components{ 8203aa3if$_=~m/[/?%*:|"<>$-]|^..|^s*$| 821c8c9^CON|PRN|AUX|NUL|COMd+|LPTd+$|./ix{ 82254d0die"Applicationarchivecontainsinvalidfilename:$f"; 823d384} 8242a9a} 82561f2#Extractthefile 826525a$tar->extract_file$f,join'/',@components 8278090ordie"Couldnotextract$ffromapp:$!"; 828e266} 829c2b7} 830da39 831b8b8subUploadFiles{ 832cebdmy$id,$outfiles=@_; 833029emy$url=$work{$id}{upload}; 834aad1my$async=HTTP::Async->new; 835f395my%put=; 8363357my$got=0; 8371c39my@reply=; 8381e4cforeachmy$f@$outfiles{ 83919ccmy$buf=''; 840e6b6openF,'<',$forWorkFail$id,"Failedtoreadoutputfile$f"; 841cbc5binmodeF;#FIXMEbufferedIO? 842cc4d1whilesysreadF,$buf,512,length$buf; 843e4cbcloseF; 115 PAGE 126 84417e3my$checksum=Digest::MD5::md5_hex$buf; 845bf98if$url=~m/^data:/{ 84618fc#"Upload"todata:URI.IdeallyLWPcoulddothisforme, 8474548#soIwouldn'thavetoduplicatecode. 848a484my$u=URI->new'data:'; 8496893$u->data$buf; 85048e3$got++; 851680cPostStatus$id,progress=>$got; 85253f7push@reply,[$checksum,$f,$u.'']; 8535675unlink$f;#Deletesuccessfullyuploadedfile 8542a9a} 8555834else{ 856b57cmy$req=HTTP::Request->newPUT=>urlhosthack$url; 8572cf2$req->user_agent$USERAGENT; 858a02d$req->content$buf; 8599992$req->encode'gzip'iflength$buf>1024; 86087bemy$reqid=$async->add$req; 861a84b$put{$reqid}=[$checksum,$f]; 8622a9a} 863e266} 8648015whilemy$response,$respid=$async->wait_for_next_response{ 8658a20#Updateprogressbar 866c4a8$got++; 8671023PostStatus$id,progress=>$got; 868739e#Findourinformationaboutthisrequest 869d589my$resp=$put{$respid}; 8706659delete$put{$respid}; 87182a3#CheckresponseisOK 87213c9if!$response->is_success{ 873810dmy$code=$response->code; 87472f6returnWorkFail$id,"HTTP$codewhileuploading$resp->[1]"; 8752a9a} 87650baunlink$resp->[1];#Deletesuccessfullyuploadedfile 877c866my$c=$response->content; 8788116$c=~s/s//g; 879a29epush@reply,[@$resp,$c]; 880e266} 881a85dreturn@reply; 882c2b7} 883da39 8848e70#AppisresponsibleforcleaninguptransientfilesFIXME? 885e34a#subCleanupFiles{ 8866183#my$id=@_; 887a51d#my$obj=$work{$id}; 8888e38#foreachmy$f@{$obj->{files}}{ 8890b74#my$valid,$url,$fn=@$f; 890c7a6##FIXME:Thefollowingistoomuchofahack,needbetterscheme. 8914460##unlink$fnif$fn=~m/^temp/; 89228ed#} 893ee77#} 894da39 8951dcd1; File9: distclientplain.pl 104c4#!/usr/bin/perl 2d08f# 3699d#distclientplain.pl-Textinterfaceforga-spectroscopydistributorclient. 4d08f# 5da39 6e9bfusethreads; 7310ausethreads::shared; 8426auseFindBin; 96de0BEGIN{chdir$FindBin::Bin;push@INC,"$FindBin::Bin/lib"} 109d9busewarnings; 118428usestrict; 12da39 133248my$DEBUG=@ARGV&&$ARGV[0]=~m/debug|all/i; 14da39 151879subcatch_zap{print"Exitingn";OnExit;exit} 160704$SIG{INT}=&catch_zap; 17da39 18c749our$THREADCOUNT; 19eb1eif!do'distclient.pl'{ 20ac68print"$@n$!n"; 219094exit1; 22c2b7} 23da39 248178$|=1; 25236emy@events:shared=;my@last=; 267e57our%callbacks; 27fb85$callbacks{poststatus}=sub{ 28b5cdmy$result=@_; 293a2elock@events; 30239cpush@events,$result; 31f6bb#printto_json$result,"n"; 3272c5cond_signal@events; 33479f}; 34e5dcStartClient; 35c2cewhile1{ 36817amy$st=undef; 373a55{#Waitforwork 38b3a1my$got_zap=0;local$SIG{INT}=sub{$got_zap++}; 39cd36lock@events; 404cafcond_timedwait@events,time+2until@eventsor$got_zap; 41f53e$st=shift@eventsunless$got_zap; 42e266} 436535lastif!$stor$st->{mode}eq'STOPPING'; 44fad5my$thr,$mode,$now=$st->{thread}||0,$st->{mode},time; 45251emy$str=sprintf"[%2d]%11s%s",$thr,$mode,RenderStatus$st; 46328c$str.=sprintf"%04d/%04d",$st->{progress},$st->{range} 47942dif$st->{progress}&&$st->{range}&&$st->{progress}<$st->{range}; 4859b2if$DEBUG{printlength$str.":'$str'n"} 496e2delsif$modene'WORKING'or$last[$thr]+2<$now 506ba4{print"$strn";$last[$thr]=$modeeq'WORKING'?$now:0} 51c2b7} 52da39 532852OnExit; 54578cexit0; 55da39 File10: gaspecclient.pl 1d08f# 296f2#gaspecclient.pl-DistributedComputingClientforga-spectroscopy. 3d08f# 47cb7#Thisfileisdynamicallyreloadedbyeachworkerthreadwheneverthe 5b5de#applicationchanges.PERLTHREADSAREBUGGY:KEEPTHISCODESIMPLE. 6d08f# 7e89bpackageDistClient::GaSpectroscopy; 8310ausethreads::shared; 99d9busewarnings; 108428usestrict; 11da39 12ce89my$HAVE_Win32_Process=0; 132c3deval'useWin32::Process;$HAVE_Win32_Process=1'; 14da39 15a9casubWork{ 163eedmy$id,$obj=@_; 17fd81#my$obj=$work{$id}; 18e111my$TEMPDIR=File::Spec->catdirFile::Spec->curdir,'temp'; 19da39 20aa29#Createinputfilefromconffileandpopfile 210016my$conffile=$obj->{files}[-2][2]; 22c46emy$popfile=$obj->{files}[-1][2]; 232b29my$nitems=0; 243597my$fh,my$infn= 25d688File::Temp::tempfile"temp-input-XXXXX",DIR=>$TEMPDIR, 261edfOPEN=>1,EXLOCK=>0,UNLINK=>0; 277c57foreachmy$srcfn$conffile,$popfile{ 28def0openIN,'<',$srcfnor 29b7e3returnmain::WorkFail$id,"Failedtoread$srcfn"; 30e6fawhile PAGE 127 3211f5print$fh$_; 332a9a} 34e266} 35ab3acloseIN; 3676c6close$fh; 37b8ac#Wecangetridofthepopulationfilenow,sinceit'suniquetothis 3837a0#workunit. 39aea9unlink$popfile; 40da39 41278e#CreateoutputfileFIXME:Thisshouldallhappeninarealtempdir 42b051undef,my$outfn= 434253File::Temp::tempfile"temp-output-XXXXX",DIR=>$TEMPDIR, 44f888OPEN=>0,EXLOCK=>0,UNLINK=>0; 45da39 46e3cd#Startgaspecprocess 470626my$i=0; 48548emain::PostStatus$id,mode=>'WORKING',progress=>0,range=>$nitems; 492f82my$logbuf="ERRORLOGFOLLOWSSomethingwentwrongn"; 50483dmy$rc; 5135f4my$kill=0; 525403my$oldsigterm=$SIG{TERM}; 53ed08{ 540dc8my$pid=0; 55b070#Ifwedon'tshutdownthesubprocessbeforestoppingthethread, 5652ad#Win32Perlcrashes. 570cc9local$SIG{TERM}=sub{$kill=2}; 589fcflocal$SIG{HUP}=sub{$kill=1}; 598335#print"$idStarting./ga-spectroscopy-client"$infn""$outfn"n"; 6031d8my$program=File::Spec->catfileFile::Spec->curdir, 613467'ga-spectroscopy-client'; 62022bmy$cmd="$program"$infn""$outfn""; 63df74$cmd="nice-n19$cmd"if$^One'MSWin32'; 643679$pid=openSPEC,'-|',$cmd 654931ormain::WorkFail$id,'Cannotstartgaspecclient'; 669d82#NicetheprocessonMSWin32. 67e386main::Win32Nice$pid; 6811e2{my$oldfh=selectSPEC;$|=1;select$oldfh} 69fb97#print"$idReadingn"; 7031e3while PAGE 128 Bibliography [1]H.M.Pickett,Thettingandpredictionofvibration-rotationspectrawithspin interactions," JournalofMolecularSpectroscopy ,vol.148,no.2,pp.371{377, 1991. [2]W.L.MeertsandM.Schmitt,Anewautomatedassignandanalysingmethod forhigh-resolutionrotationallyresolvedspectrausinggeneticalgorithms," PhysicaScripta ,vol.73,pp.C47{C52,2006. [3]M.SchmittandW.L.Meerts,Rotationallyresolvedelectronicspectroscopyand automaticassignmenttechniquesusingevolutionaryalgorithms,"in Handbook ofHigh-resolutionSpectroscopy M.QuackandF.Merkt,eds.,pp.1345{1371, JohnWiley&Sons,Ltd,2011. [4]I.A.Finneran, TheMicrowaveSpectroscopyofSmallMoleculeswithMethyl Rotors .Bachelor'sthesis,NewCollegeofFlorida,Sarasota,Florida,May2011. [5]D.E.Goldberg, Geneticalgorithmsinsearch,optimization,andmachinelearning .Reading,Massachusetts:Addison-Wesley,1989. [6]D.S.JohnsonandL.A.McGeoch,Thetravelingsalesmanproblem:Acase studyinlocaloptimization,"in LocalSearchinCombinatorialOptimization. 118 PAGE 129 E.H.L.AartsandJ.K.Lenstra,eds.,Wiley-Interscienceseriesindiscrete mathematicsandoptimization,pp.215{310,Wiley,1997.Section8.6.2. [7]D.L.Applegate,R.E.Bixby,V.Chvatal,andW.J.Cook, TheTravelingSalesmanProblem:AComputationalStudy ,ch.2.4,pp.69{75.Princetonseriesin appliedmathematics,PrincetonUniversityPress,2006. [8]R.Poli,W.B.Langdon,andN.F.McPhee, AFieldGuidetoGeneticProgramming .Lulu.com,2008. [9]F.Gray,Pulsecodecommunication."USPatent2,632,058,March1953. [10]B.Reinhold,I.A.Finneran,andS.T.Shipman,Roomtemperaturechirpedpulsefouriertransformmicrowavespectroscopyofanisole," JournalofMolecular Spectroscopy ,vol.270,no.2,pp.89{97,2011. [11]SETI@home." http://setiathome.berkeley.edu/ .Accessed18April2012. [12]Folding@home{Main." http://folding.stanford.edu/ .Accessed18April 2012. [13]E.McIntoshandA.Wagner,CERNmodularphysicsscreensaverorusingspare CPUcyclesofCERN'sdesktopPCs,"in ComputinginHighEnergyandNuclear Physics ,p.1055,2004. [14]GreatInternetMersenneprimesearch." http://www.mersenne.org/ .Accessed 18April2012. [15]D.P.Anderson,BOINC:Asystemforpublic-resourcecomputingandstorage," in GridComputing,2004.Proceedings.FifthIEEE/ACMInternationalWorkshopon ,pp.4{10,IEEE,2004. 119 PAGE 130 [16]S.Yi,E.Jeannot,D.Kondo,andD.Anderson,Towardsreal-time,volunteer distributedcomputing,"in Cluster,CloudandGridComputingCCGrid,2011 11thIEEE/ACMInternationalSymposiumon ,pp.154{163,IEEE,2011. [17]AmazonWebServices, AmazonElasticComputeCloudUserGuide [18]AmazonEC2microinstancesdeeperdive{HuanLiu'sblog." http://huanliu. wordpress.com/2010/09/10/amazon-ec2-micro-instances-deeper-dive/ Accessed24April2012. [19]M.J.Frisch,G.W.Trucks,H.B.Schlegel,G.E.Scuseria,M.A.Robb,J.R. Cheeseman,J.A.Montgomery,Jr.,T.Vreven,K.N.Kudin,J.C.Burant,J.M. Millam,S.S.Iyengar,J.Tomasi,V.Barone,B.Mennucci,M.Cossi,G.Scalmani,N.Rega,G.A.Petersson,H.Nakatsuji,M.Hada,M.Ehara,K.Toyota, R.Fukuda,J.Hasegawa,M.Ishida,T.Nakajima,Y.Honda,O.Kitao,H.Nakai, M.Klene,X.Li,J.E.Knox,H.P.Hratchian,J.B.Cross,V.Bakken,C.Adamo, J.Jaramillo,R.Gomperts,R.E.Stratmann,O.Yazyev,A.J.Austin,R.Cammi, C.Pomelli,J.W.Ochterski,P.Y.Ayala,K.Morokuma,G.A.Voth,P.Salvador, J.J.Dannenberg,V.G.Zakrzewski,S.Dapprich,A.D.Daniels,M.C.Strain, O.Farkas,D.K.Malick,A.D.Rabuck,K.Raghavachari,J.B.Foresman,J.V. Ortiz,Q.Cui,A.G.Baboul,S.Cliord,J.Cioslowski,B.B.Stefanov,G.Liu, A.Liashenko,P.Piskorz,I.Komaromi,R.L.Martin,D.J.Fox,T.Keith, M.A.Al-Laham,C.Y.Peng,A.Nanayakkara,M.Challacombe,P.M.W.Gill, B.Johnson,W.Chen,M.W.Wong,C.Gonzalez,andJ.A.Pople,Gaussian 03,RevisionE.01."Gaussian,Inc.,Wallingford,CT,2004. [20]S.T.ShipmanandB.H.Pate,Newtechniquesinmicrowavespectroscopy," in HandbookofHigh-resolutionSpectroscopy M.QuackandF.Merkt,eds., pp.801{827,JohnWiley&Sons,Ltd,2011. 120 PAGE 131 [21]B.Reinhold, MicrowaveSpectroscopyofMono-SubstitutedBenzenesatRoom Temperature .Bachelor'sthesis,NewCollegeofFlorida,Sarasota,Florida,May 2010. [22]S.T.Shipman,J.L.Neill,R.D.Suenram,M.T.Muckle,andB.H.Pate, Structuredeterminationofstrawberryaldehydebybroadbandmicrowavespectroscopy:Conformationalstabilizationbydispersiveinteractions," TheJournal ofPhysicalChemistryLetters ,vol.2,no.5,pp.443{448,2011. [23]D.J.Ram,T.H.Sreenivas,andK.G.Subramaniam,Parallelsimulatedannealingalgorithms," Journalofparallelanddistributedcomputing ,vol.37,no.2, pp.207{212,1996. 121 |