IDENTIFYING MELANOMA USING COMPUTER VISION AND AN ARTIFICIAL NEURAL NETWORK

BY HANNAH RIVERS

A Thesis
Submitted to the Division of Natural Sciences
New College of Florida
in partial fulfillment of the requirements for the degree Bachelor of Arts
Under the sponsorship of Pat McDonald

Sarasota, Florida
April, 2012

Contents

1 Introduction to the problem .......... 1
  1.1 Facts about melanoma .......... 1
  1.2 Background information on human classification of melanoma .......... 2
  1.3 Background information on machine learning approaches .......... 4
2 Computer vision .......... 8
  2.1 An introduction to computer vision .......... 8
3 Theory of neural computation .......... 10
  3.1 An introduction to neurocomputing .......... 10
  3.2 Biological neural networks .......... 11
  3.3 Artificial neural networks .......... 13
  3.4 Types of problems for which artificial neural nets are best suited .......... 15
  3.5 Why an artificial neural network is a good fit for this problem .......... 17
4 Artificial neural networks .......... 18
  4.1 Necessary concepts .......... 18
  4.2 The perceptron .......... 22
  4.3 The multi-layer perceptron .......... 25
5 Considered and selected approaches .......... 31
  5.1 Questions of intent .......... 31
  5.2 Preprocessing techniques .......... 32
  5.3 Selection of features .......... 33
    5.3.1 ABCD rule (Stolz method) .......... 35
    5.3.2 Menzies' method .......... 37
    5.3.3 Seven-point checklist .......... 37
    5.3.4 Formulated method used in this thesis .......... 38
  5.4 Practical limitations .......... 41
  5.5 Software .......... 43
    5.5.1 Computer vision .......... 43
    5.5.2 Artificial neural network .......... 43
6 Final product .......... 47
  6.1 Results .......... 47
    6.1.1 Project results .......... 47
    6.1.2 Similar project results .......... 50
  6.2 Possible improvements .......... 51
7 Code and sample output .......... 56
References .......... 76

List of Figures

1 A neuron .......... 11
2 Artificial neuron .......... 14
3 Gradient descent .......... 18
4 The perceptron function .......... 22
5 Multi-layer perceptron .......... 25
6 Stolz's ABCD rule for identifying melanoma .......... 36
7 Color detection .......... 39
8 Corner detection .......... 39
9 Structure detection .......... 40
10 First step of XOR function network training for first input .......... 46
11 Results .......... 48
12 The Haar wavelet .......... 52
13 Summed area table .......... 53
14 Flow of code .......... 55

IDENTIFYING MELANOMA USING COMPUTER VISION AND AN ARTIFICIAL NEURAL NETWORK

Hannah Rivers
New College of Florida, 2012

ABSTRACT

Melanoma is an increasingly common and very serious form of cancer, claiming almost fifty thousand lives annually. The single most important factor in successful treatment of melanoma is early detection.

In this thesis, an artificial neural network designed to distinguish melanoma from dysplastic nevi was created and trained. Heuristics employed by dermatologists to diagnose melanoma were synthesized into a preprocessing method that extracts relevant features, transforming an image into a feature vector. These vectors were then passed to the artificial neural network so it could learn patterns in the data.

The artificial neural network was trained on an amalgamation of images from two publicly available datasets consisting of images of 101 melanoma and 129 benign melanocytic lesions. Because the performance of artificial neural networks is stochastic, results were averaged over a variety of parameter and starting values. On average, the network is capable of distinguishing between the lesions with 64% accuracy, with 53% sensitivity and 73% specificity. At best, its accuracy is 78%, with 86% sensitivity and 68% specificity.

Pat McDonald
Division of Natural Sciences

1 Introduction to the problem

1.1 Facts about melanoma¹

The most common form of cancer in the United States is skin cancer, affecting between two and three million people globally each year and comprising a third of all diagnosed cases of cancer. According to the World Health Organization, one in every five Americans will develop skin cancer in their lifetime. Melanoma is a type of skin cancer; specifically, it is a cancer of melanocytes, the cells responsible for the pigmentation of skin. While it is among the least common types, it is the most serious; according to the National Institutes of Health, roughly 160,000 cases are diagnosed globally each year, leading to 48,000 melanoma-related deaths. The incidence of both non-melanoma and melanoma skin cancers has been increasing over the past decade.

Although serious, melanoma can be cured if it is diagnosed and treated early. If, however, it is not removed in its initial stages, the cancer cells can expand downward from the skin surface, invading healthy surrounding tissue and organs. Once this has occurred, it is very difficult to control. Consequently, the most important aspect of melanoma treatment is early recognition.

Obviously, it is prudent to be cautious when it comes to abnormal melanocytic lesions. Unfortunately, this caution comes at a price; health care is expensive. Many people cannot afford to consult a doctor or a dermatologist at every suspicion. Mitigating this concern is the hopeful eventuality of the application of this thesis and other works like it. In general, the trend of modern medicine is to create machines that perform tasks previously performed by humans, ideally with greater accuracy in less time using fewer resources. The type of work in this thesis is the first step down that road.

¹ Facts from the websites of the National Cancer Institute, the World Health Organization, and the Mayo Clinic. [6][7][3]

On that note, it is critically important to issue the following disclaimer:

    The intent of any application directly based on this thesis is to serve as a guide to indicate the likelihood of melanoma. It is exclusively a primary recognition tool, and is in no way meant to be used as a diagnostic tool or as a replacement for a doctor. If you have serious concerns that you may have melanoma, it is imperative that you seek qualified medical advice.
1.2 Background information on human classification of melanoma²

To both aid in the diagnosis of and assist the public in recognizing melanoma, doctors developed the heuristic acronym "ABCDE." Each letter represents a characteristic indicative of a melanoma, and meeting one or more of these criteria may indicate its existence. The first characteristic is asymmetry. Common nevi (benign, common lesions, otherwise known as moles) are generally symmetrical circles, while the shape of a melanoma is often irregular. The second characteristic regards the border of the lesion. The borders of common nevi are usually very well defined and obvious, fading out gradually, while a melanoma may have a ragged, blurred, sharp, or otherwise irregular border. The third characteristic is color. Typically, common nevi are a single, uniform shade of medium to dark brown. Melanomas may be a variety of colors including black, red, white, or even slightly blue. Additionally, their color may not be consistent, but rather form "patches." Depending on your source, the fourth distinguishing characteristic is either diameter or differential structures. While common nevi are almost invariably one to three millimeters across, a melanoma is commonly larger than six. Additionally, melanomas may contain structures unseen in common nevi, such as globules, streaks, and patterns (see Section 5.3 on page 33). The last and most indicative characteristic is evolution of the lesion. Any changes in shape, border, color, size, or structure are highly indicative of a melanoma, as is the presence of pain, itchiness, bleeding, oozing, or general irritation.

Due to the time-sensitive nature of melanoma treatment, doctors recommend monthly self-examinations to inspect for skin lesions fitting the ABCDE rule. It is important to familiarize yourself with your lesions so as to recognize changes. The use of a hand-held mirror is recommended to inspect hard-to-see regions. Melanoma can occur on the scalp, groin, fingernails, the soles of feet, and in the area between toes, so it is imperative to be thorough.

² Facts from the website of the National Cancer Institute. [6]
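The ABCDE heuristic above is a screening checklist, not a clinical formula, but its flag-counting structure can be illustrated with a minimal sketch. The feature names, their encodings, and the example lesion below are illustrative assumptions for this sketch only, not a diagnostic rule from this thesis:

```python
# Hypothetical sketch of the ABCDE screening heuristic described above.
# The feature names and example values are illustrative assumptions,
# not a clinical rule; any flag at all warrants professional evaluation.

def abcde_flags(lesion):
    """Return the list of ABCDE criteria a lesion description meets."""
    flags = {
        "A": lesion["asymmetric"],            # irregular, non-circular shape
        "B": lesion["irregular_border"],      # ragged, blurred, or sharp border
        "C": len(lesion["colors"]) > 1,       # more than one color present
        "D": lesion["diameter_mm"] > 6,       # larger than six millimeters
        "E": lesion["evolving"],              # any recent change in the lesion
    }
    return [letter for letter, met in flags.items() if met]

lesion = {
    "asymmetric": True,
    "irregular_border": False,
    "colors": ["brown", "black"],
    "diameter_mm": 7.5,
    "evolving": False,
}
print(abcde_flags(lesion))  # this example lesion meets the A, C, and D criteria
```

Note that the checklist is deliberately one-sided: it only raises flags, mirroring the text's point that a false positive is far less costly than a missed melanoma.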
If a suspicious lesion is found, the next step is to contact a dermatologist or physician. If they suspect that it's possibly a melanoma, they will biopsy the lesion and send the sample to a pathologist for classification. There are four types of biopsies: shave, punch, incisional, and excisional. The first involves shaving the lesion off with a thin sharp blade. The second involves a tool with a circular blade, used to remove a round section of skin containing the lesion in question. In the third, only the most irregular portion of the growth is removed for analysis. In the fourth, the entire lesion is surgically removed. These are often performed when the likelihood of melanoma is high and time is of the essence. Within a few days, the pathologist will inspect the sample and empirically determine whether or not it is cancerous.

In the case of a positive diagnosis, the ensuing action is to quantify the severity of the condition. Melanoma is staged between I and IV, where a stage I melanoma is small with a predominantly successful treatment rate and a stage IV melanoma has spread to other organs and subsequently has a very low likelihood of successful treatment. The pathologist assigns the stage through two main criteria: thickness and the results of a sentinel node biopsy. The thickness of a lesion can be measured under a microscope. Because melanomas expand first horizontally then downwards into the skin, the higher the measure the more serious the disease. A sentinel node biopsy is a test conducted by the doctor which determines whether cancer cells have spread to nearby lymph nodes by injecting dye into the area at which the melanoma was removed, then charting its course through the lymphatic system. The first nodes to collect dye are removed and tested for cancer cells. If they are not present in the nearest lymph nodes, it is unlikely that the melanoma has spread.
The differentiation of early melanoma from dysplastic nevi (uncommon benign lesions that resemble melanoma) is difficult even for experienced dermatologists. In fact, underestimation of the severity of melanoma in its early stages is disconcertingly not uncommon. However, statistically the much larger issue is the patient failing to notice or to consult a doctor in time. In the majority of cases, if the patient brings the lesion to the doctor's attention early, melanoma can be properly identified and treated.

1.3 Background information on machine learning approaches

Medical image recognition is not a new application for the field of machine learning. There have been numerous previous endeavors of skin lesion classification (some of which will be discussed in Section 6.2) and similar problems with varying methods and varying levels of success, usually directly correlated with accessibility of financial resources.

Despite their differences in methods, all machine learning approaches to medical image classification necessarily share a set of fundamental characteristics, consisting of image acquisition, feature selection and extraction, image processing and analysis, and classification. The first step in any approach is the acquisition of a set of images on which to train the model. The next is the extraction of pertinent data from these images. Finally, said data must be processed in such a manner that analysis of the results yields a classification. In machine learning, the processing is done through a series of adaptive algorithms that allow the model to evolve its behaviors based on relationships and patterns present in the data. This enables more intelligent decision making than can be done with a static algorithm.

In each of the aforementioned areas, the details of operation vary. For instance, the primary question of image acquisition is what type of images the program is ultimately going to be used to classify. These are the images on which it is necessary to train the network. The next step is to determine where a set of these images can be attained, keeping in mind that supervised learning requires a dataset with known outcomes.
Next, decisions must be made regarding feature selection and extraction. The success of an image recognition program depends on the correct selection of distinguishing features for classification. The features must be quantifiably measurable and have both high sensitivity (the proportion of correctly identified positive responses) and high specificity (the proportion of correctly identified negative responses). For the problem of melanoma identification, sensitivity is by far the more important value. It is much more preferable to mistakenly classify a negative value as positive than vice versa. In the specific field of skin lesion classification, diagnostic methods such as the ABCD rule (also known as Stolz's method), Menzies' method, and the seven-point checklist have been developed with effective feature selection techniques. These methods are further explained in Section 5.3. Once the features have been selected, the next step is to determine the method through which they will be extracted. Many freeware and open source computer vision libraries and software suites for this purpose already exist, of varying levels of establishment. Pending feature extraction, the next step is to process the extracted data. Each of the aforementioned diagnostic methods has a corresponding set of rules through which to process feature data into information. Typically, these fall under the category of standard optimization problems that can be resolved with heuristic strategies, greedy or genetic algorithms, other computational intelligence methods such as strategies derived from statistical pattern recognition, or in this case, artificial neural nets. The final step of the process is to take the information generated in the previous step and decide for which thresholds it should be classified into different categories, as well as which categories to use. For certain algorithms, this is automatically performed in the preprocessing step. Unsupervised algorithms such as clustering and blind signal separation determine on their own which categories properly divide the dataset.

For the problem of skin lesion classification, the most substantial issue is that of training image acquisition. This is the area in which budget has the most limiting effect. Ideally, epiluminescence microscopy (ELM) can be used. Otherwise known as dermoscopy, ELM is the examination of skin lesions with a dermatoscope. This process consists of a magnifier (typically x10), a non-polarized light source, a transparent plate, and a liquid medium between the instrument and the skin, which allows detailed inspection of skin lesions unobstructed by reflections on the skin's surface. It is used to render the epidermis translucent, making the dermal features clearly visible. Dermoscopy images allow detection of early melanomas as small as 1 mm,³ a feat impossible with standard clinical images. Another option is transmission electron microscopy (TEM), a second microscopy technique which reveals the structure of elastic networks in the dermis. A beam of electrons is transmitted through an ultra-thin specimen, which the electrons interact with as they pass through. From this interaction an image is formed; that image is magnified and focused onto a fluorescent screen on a layer of photographic film. Another option, computed tomography (CT) scans, uses digital geometry processing to generate a three-dimensional image of the inside of an object from a large series of two-dimensional X-ray "slices" taken around a single axis of rotation. Multi-frequency electrical impedance tomography can create an image of a lesion by recording and analyzing the resistance with which electric waves of different frequencies move through it. However, none of these methods are plausible without a professional budget. Instead, clinical images are used. A common economical technique is to use video cameras that can be controlled and parameterized in real time for a more three-dimensional and detailed image with relatively high color constancy. Although this is a relatively inexpensive method, neural networks are only useful on the type of images on which they were trained, so for an application intended to be used by the general public the most readily available imaging devices are high-resolution, low-distortion cameras. Unfortunately, they don't account for color constancy, and it's difficult to get a large, centered, focused, and detailed image of the lesion. Regardless, it is the most plausible option, and subsequently the one pursued in this thesis.⁴

³ From Sherea M. Stricklin et al., "Melanoma in situ in a private practice setting 2005 through 2009: Location, lesion size, lack of concern" [26]
⁴ From Ilias Maglogiannis and Charalampos N. Doukas, "Overview of Advanced Computer Vision Systems for Skin Lesions Characterization" [20]

2 Computer vision⁵

2.1 An introduction to computer vision

Computer vision is a field of computer science that tries to program computers to "see," or comprehend visual data. Gary Bradski and Adrian Kaehler define it as "the transformation of data from a still or video camera into either a decision or a new representation, performed in the context of a specific purpose or task." They define a decision as output of the type "an object exists in this image" and representation as output such as removing camera motion from an image sequence. The input data may include some contextual information known as metadata that can aid the computer in its task by providing preliminary discriminators.

Although it's an incredibly complex process, processing visual information is inherently natural to us. The human brain divides a visual signal into many channels that stream different kinds of information into different areas of the brain, such as an attention system that identifies important sections of an image while suppressing examination of other areas. Widespread associative inputs from muscle control sensors and all the other senses allow the brain to draw on cross-associations gained from experience. Feedback loops in the brain go back to all stages of processing, including the hardware sensors themselves, which mechanically control lighting via pupil dilation and tune the reception on the surface of the retina.

For computers, this is not a natural or inherent process. The computer receives a grid of numbers, yet has none of the inherent pattern recognition skills or cross-associations that allow and refine human visual perception. Additionally, any given number in the grid has a large noise component. Thus, our task becomes to transform the noisy grid of numbers into the perception of an object of a specific type.

Formally, this is impossible. Neither a unique nor a definitive solution exists to the problem of reconstructing a 2-dimensional view of a 3-dimensional world into its third dimension. For instance, when we see something close and something far, the closer object is relatively larger. However, without background knowledge, it is just as reasonable to assume that the closer object is actually larger. Angles pose the same problem. Our ability to reconstruct the third dimension is informed by years of practice and collecting a base of knowledge about the nuances of visual perception.

⁵ From Gary Bradski and Adrian Kaehler, "Learning OpenCV: Computer Vision with the OpenCV Library" [11]
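The "grid of numbers" framing above can be made concrete with a small sketch. The tiny 4x4 image and the noise range below are invented purely for illustration; real images are far larger and their noise is more structured:

```python
import random

# Sketch of the "noisy grid of numbers" described above: to a computer, an
# image is just an array of intensities, each carrying a noise component.
# The 4x4 grid and the +/-2 noise range are invented for illustration.

random.seed(0)  # make the example reproducible

clean = [[0, 0, 0, 0],
         [0, 9, 9, 0],
         [0, 9, 9, 0],
         [0, 0, 0, 0]]  # a tiny "object": a bright square on a dark field

noisy = [[value + random.randint(-2, 2) for value in row] for row in clean]
for row in noisy:
    print(row)  # the object is still there, but every number is perturbed
```

Recovering "a bright square" from the second grid, without ever being told what a square is, is the task computer vision sets for itself.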
Aside from the dimensionality problem, distortion is another problem a computer can't account for by itself. Distortion comes from a variety of sources including environmental variance (weather, lighting, reflections, movements), mechanical imperfections, finite integration time on the sensor (motion blur), electrical noise, and compression artifacts following image capture. Over the years compensation for these issues has become second nature and we have learned to tune them out. For a machine, the best ways to compensate are through machine learning techniques that correlate hidden variables with their values in a labeled training set, or statistical methods that serve as discriminators. Ideally, a perfectly constrained context would fundamentally simplify the problem; however, that model would have little to no use in the real world. This is where preprocessing becomes very important (see Section 5.2). Normalization is vital to computer vision.

3 Theory of neural computation⁶

3.1 An introduction to neurocomputing

Before discussing artificial neural networks, there are several prerequisite concepts one must be familiar with in order to understand the field of neurocomputing. First, it is necessary to understand enough neuroscience to comprehend how certain neuroscientific models function to generate certain types of approximations, and in which fields these approximations are more and less accurate. Second, it is necessary to be acquainted with enough fundamental math and coding to understand the implementation of these models. Next, it is necessary to understand enough cognitive science to have some idea about what the brain is supposed to do, and its mechanisms of functionality. Finally, with this knowledge, it is necessary to assume that it's possible to make meaningful simplifications of some aspects of the human nervous system.⁷
Under this inherent assumption, neurocomputing is most simplistically defined as brain-like computation. There are many paradigms of computing system organization; neurocomputing is an attempt to build computers that are designed like the brain, in hopes of emulating it. Brains have both strengths and weaknesses, like any other computational mechanism. Human brains are not all-purpose computing devices; they are powerful at some tasks and weak at others. Their power resides in effective biological preprocessing, the use of memory in place of computing power, and high efficiency with a small number of operations. They're adept at pattern recognition, motor control, perception, flexible inference, intuition, and educated guessing; however, they are slow, imprecise, prejudiced, make erroneous generalizations, and are usually incapable of explaining their actions. All of these properties, desirable or not, may also be typical of neurocomputing. Because biological neural nets are bounded by biological constraints, they provide a wealth of information in the practical engineering and economic sense as well as the computational, embodying a naturally selected example of optimization over available resources.

⁶ From James A. Anderson, "An Introduction to Neural Networks" [8]
⁷ Some would argue that neurobiology is intrinsically too nuanced to simplify meaningfully, or that we don't know enough about its mechanism to make correct generalizations. While this may actually be true, without the assumption neuroscientific models are not substantiable.

3.2 Biological neural networks

Biological neural networks are populations of interconnected neurons whose inputs and signaling targets define a recognizable circuit. They are the structure through which our brains process information. Every intentional and instinctual thought and motion is a result of interactions between a network of neurons.

Figure 1: A neuron
Our brains contain roughly 100 billion neurons. Their structure can be seen in Figure 1. The input end of neurons is made up of dendrites, which branch out in a tree-like manner from the cell body, also known as the soma. Extending from the soma is a long, thin projection known as the axon, the transmission line of the neuron. The axon can give rise to collateral branches, forming a vast, interconnected network. When axons reach their end they branch again into a structure known as the terminal arborization. At the ends of the terminal arborization are synapses, comprising the output end of the neuron. Dendrites receive inputs from other cells; the soma processes the inputs then transmits information along the axon to the synapses, which emit output to be received by other neurons via neurotransmitters diffused across the synaptic cleft, the empty region between the synapses of one neuron and the dendrites of another.

Each neuron is connected to thousands of other neurons, all constantly communicating via electrochemical signals. It continuously receives signals from these other neurons then sums up the inputs according to some internal process. If the result is greater than some threshold value, the neuron fires by generating a voltage, known as an action potential, that transmits down the axon to the terminal arborization. This response is an all-or-nothing binary; either there is an action potential or there is not. Following firing, a neuron has both an absolute and a relative refractory period during which it cannot fire again. For the former, there exists no action threshold nor subsequent possibility of firing, but for the latter, the threshold is merely elevated. This relative refractory period, also known as synaptic resistance, is adaptable, which can cause modified or "learned" behavior of the neuron in which firing frequency will rise if a stimulus is maintained. This phenomenon, first discovered by Donald Hebb in 1949, is known as Hebb's rule: if the input of a neuron is repeatedly and persistently causing the neuron to fire, a metabolic change occurs in the synapse of that particular input to reduce its resistance.⁸
This increases the efficiency of a common action potential flow through a neural network. These varying, modifiable resistances give rise to detailed interactions among the web of neurons that are paramount to the nature of computation that neural networks perform.

⁸ From Donald Hebb, "The Organization of Behavior" [18]

Biological pressures on neural networks enforce two basic ground rules: each human begins with roughly 100 billion neurons, and neurons must earn their existence or die. They are metabolically expensive, and thus biological pressure dictates that the fewest possible must be used.

There are three different types of neurons: sensory, motor, and central. Sensory neurons are activated by physical stimuli and send signals that inform the central nervous system of the state of the body and the external environment. Motor neurons connect muscles and effector organs to the nervous system so that they can be controlled. Central neurons form all of their input and output connections with other neurons.

Interactions between these types of neurons are what cause our brains and bodies to function, distinguishing animals from plants. They constantly and tirelessly perform not only our cognitive functions but also automatic and unconscious ones, such as response to pain stimulus and maintaining unconscious breathing. Every one of our actions and thoughts is a result of learned interactions between neurons in an incessantly flowing network spanning our entire body, generating our perception of the world and determining our behavior.
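Hebb's rule as stated above is qualitative, but it is commonly summarized computationally as "cells that fire together wire together": a connection weight grows in proportion to the product of the pre- and post-synaptic activities. A minimal sketch of that standard formulation follows; the learning rate and activity values are illustrative assumptions, not parameters from this thesis:

```python
# Minimal sketch of the standard computational reading of Hebb's rule:
# a synaptic weight strengthens in proportion to the product of the
# pre- and post-synaptic activities. Rate and data are illustrative.

def hebbian_update(weight, pre, post, rate=0.1):
    """One Hebbian step: w <- w + rate * pre * post."""
    return weight + rate * pre * post

w = 0.0
for _ in range(5):  # repeated, persistent co-activation...
    w = hebbian_update(w, pre=1.0, post=1.0)
print(w)            # ...steadily strengthens the connection
```

Note the asymmetry with the biological account: here a stronger weight plays the role of the reduced synaptic resistance described in the text.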
3.3 Artificial neural networks⁹

An artificial neural network (ANN) is a computational model that either abstracts or is inspired by the structural and/or functional aspects of biological neural networks. It is a network of simple processing elements that exhibits complex global behavior determined by adaptive weighted connections between processing elements and element parameters. In an ANN, neurons are represented as nodes, and synapses as weighted connections between nodes. Nodes are represented by an array of activation values, one for each node, and weights are represented by matrices with an entry for each possible pair of nodes between two connected layers. Nodes receive input, apply some sort of summation function, then output along adjacent weighted connections according to an activation function on the weighted sum of the inputs, as illustrated in Figure 2. These activation functions can be binary, like biological neural networks, or graded. These nodes and connections form a network that learns by means of an algorithm that modifies the weights between nodes, hence the structure of the network, based on external or internal information to produce a desired signal flow. Eventually, a certain input yields a certain known output, meaning the network performs some function.

⁹ Additional facts from Christopher Bishop's "Pattern Recognition and Machine Learning" and ai-junkie.com [10][1]

Figure 2: Artificial neuron
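The node behavior just described, a weighted sum of inputs passed through an activation function, can be sketched directly. The binary step activation and the example weights and threshold are illustrative choices for this sketch, not values used in this thesis:

```python
# Sketch of a single artificial neuron as described above: a weighted sum
# of inputs passed through an activation function. The binary step
# activation and the example weights/threshold are illustrative.

def neuron(inputs, weights, threshold=0.5):
    """Binary-activation node: fire (1) if the weighted sum exceeds threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

print(neuron([1.0, 0.0, 1.0], [0.4, 0.9, 0.3]))  # weighted sum 0.7 > 0.5: fires, 1
print(neuron([1.0, 0.0, 0.0], [0.4, 0.9, 0.3]))  # weighted sum 0.4 <= 0.5: silent, 0
```

A graded activation would simply replace the step with a smooth function of the same weighted sum.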
While ANNs typically abstract key characteristics of biological neural networks, there are some ways in which the biological model can be improved by technology (since computers are not constrained by biological pressures). For instance, computers are capable of creating and maintaining as many neurons as needed (although the most complex ANNs to date use on the order of hundreds or thousands of nodes rather than 100 billion). Additionally, computers perform operations very fast and are never inaccurate or fatigued (this does not mean that an artificial neural network can't be incorrect or slow, but that each individual computation comprising the process won't be). Some purists reject the omission of modeled biological pressures, thinking them to be a vital and irremovable factor in the computing mechanism of biological neural networks; however, much more often they are ignored. On the whole, the trend of modern software implementations of artificial neural networks has been largely to abandon the approach inspired by biology in favor of a more practical approach based on statistics and signal processing. In this practical approach, components of artificial neural networks are used as parts of a larger system utilizing both adaptive and non-adaptive elements.

3.4 Types of problems for which artificial neural nets are best suited¹⁰

Wolpert and Macready's "No Free Lunch" theorem¹¹ (1997) states that all optimization algorithms are equivalent when their performance is averaged across the entire candidate problem set. While the proof of this theorem is mathematically technical and non-intuitive, it is generally accepted as mathematical fact.

Since it is known that not all optimization algorithms perform equally well on each problem, we can infer that each individual algorithm must perform better on some problems than on others. This is a consequence of the observation that there can be no globally optimal algorithm for solving optimization problems due to the assertion that on average, performance of all algorithms across all problems comes out to be the same. Therefore, each algorithm has strengths and weaknesses, and individual problems are best solved with certain algorithms over others.

¹⁰ From James A. Anderson, "An Introduction to Neural Networks" [8]
¹¹ From D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization" [28]
This suggests that problems exist for which artificial neural networks should provide a highly efficient approach. However, practical applications for ANNs have been slow in presenting themselves. In computers, hardware largely determines the software that runs efficiently on it. While every computer is inherently a programmable electronic device that can perform binary arithmetic and logic functions, there is much more to our modern devices. Computers physically suited to adaptation and learning, necessarily being able to handle behavioral change, changes in internal functioning, and potential instability, have only been in existence for the last decade or so. This problem can be perceived as a parallel to the way in which we and chimpanzees differ. While 99% of our DNA is the same, the 1% difference predominantly lies in genes which modify sizes and shapes of brain structures.¹² While we contain no radical new hardware compared to chimps, rearrangements, inflations, contractions, and other modifications of the same physical structure have made our brains capable of considerably more computational power and flexibility.

It is important not to lose sight of this evolutionary perspective when considering ANNs. Artificial neural networks, like brains, cannot compute everything. Their performance is highly detail dependent, arriving at reasonable solutions only for a limited and specific set of problems. The practical applications of neural nets are the same problems selective pressure has caused animals to biologically adapt to solve: association, classification, categorization, generalization, rapid response, and quick inference from inadequate data.

In the brain, functions are performed collectively and in parallel rather than with a clear delineation of subtasks to which various neurons are assigned. This is a fundamental distinguishing property of artificial neural nets, and the reason why they are so useful for time-sensitive tasks.

¹² From James A. Anderson, "An Introduction to Neural Networks" [8]
ANNs are universal classifiers primarily used to model complex relationships between inputs and outputs or to find patterns in data. The most successful applications of neural nets to date are: function approximation or regression analysis, including time series prediction and modeling; classification, including pattern recognition, novelty detection, and sequential decision making; and data processing, including filtering, clustering, blind signal separation, and compression.

3.5 Why an artificial neural network is a good fit for this problem

Medical diagnosis of melanoma requires visual recognition of cancerous lesions, which falls under the theme of classification through pattern recognition (and is a problem currently solved by biological neural networks).

4 Artificial neural networks

4.1 Necessary concepts¹³

There are several prerequisite concepts to understanding the difference between various types of neural nets.

The first is supervised and unsupervised learning. For all neural nets, a training set of data is necessary to teach the network the connection weights that generate the desired flow. Sometimes, the appropriate output for a given data set is known, and that information can be used to calculate then minimize the discrepancy between the actual and expected output by modifying the network weights to yield the desired flow. This process is called supervised learning. Sometimes, however, the appropriate output is not known, and instead the system is required to take the input data and somehow organize it in an appropriate and yet unknown way. Methods to suit these needs are much more difficult to construct and use, yet are capable of producing insightful and reliable solutions to difficult problems. This is known as unsupervised learning. This thesis utilizes supervised learning.

¹³ From Christopher Bishop, "Pattern Recognition and Machine Learning" [10]

Figure 3: Gradient descent
The second concept is gradient descent, an optimization algorithm that finds a local minimum of a function. It takes steps from a starting point proportional to the negative of the gradient of the function at that point to descend towards the minimum as quickly as possible (see Figure 3). Sometimes an approximate instead of exact gradient is used to further speed the process. It is important to notice that gradient descent does not necessarily converge to the absolute minimum. The minimum it converges to is based on the value at which the procedure starts. Gradient descent is based on the observation that if the multivariable function F(x) is defined and differentiable in a neighborhood of a point a, then F(x) decreases fastest if one moves from a in the direction of the negative gradient of F at a, −∇F(a). It follows that, if b = a − γ∇F(a) for step size γ > 0 a small enough number, then F(a) ≥ F(b). It is important to choose γ correctly, because it is possible for gradient descent to oscillate around the correct solution. This is demonstrated by F(x) = x² with a = 1 and γ = 1, which oscillates between 1 and −1. With this observation in mind, one starts with a guess x₀ for a local minimum of F, and considers the sequence x₀, x₁, x₂, ... such that x_{n+1} = x_n − γ_n ∇F(x_n), n ≥ 0 (note that the value of the step size γ is allowed to change at every iteration). Thus, we have F(x₀) ≥ F(x₁) ≥ F(x₂) ≥ ..., which shows that the sequence (x_n) converges to a local minimum given that F is bounded below.
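The iteration above, and the oscillation caused by too large a step size, can be checked with a short sketch on F(x) = x², whose gradient is F'(x) = 2x. This is illustrative code, not code from this thesis:

```python
# Sketch of the gradient descent iteration described above, applied to
# F(x) = x^2 with gradient F'(x) = 2x. Illustrative only.

def gradient_descent(x, rate, steps):
    """Iterate x <- x - rate * F'(x) for F(x) = x^2."""
    for _ in range(steps):
        x = x - rate * 2 * x
    return x

print(gradient_descent(1.0, rate=0.25, steps=20))  # shrinks toward the minimum at 0
print(gradient_descent(1.0, rate=1.0, steps=3))    # step size 1 oscillates: 1, -1, 1, -1, ...
```

With rate 0.25 each step halves x, so the iterate approaches 0; with rate 1 each step maps x to −x, reproducing the oscillation between 1 and −1 described in the text.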
The last and possibly most important concept fundamental to neural nets (and the general field of machine learning) is Bayesian probability. Prior to defining Bayesian probability, it is important to understand the distinction between a few terms, namely probability, statistics, and likelihood. Probability and statistics are related areas of mathematics which concern themselves with analyzing the relative frequency of events. Probability deals with predicting the chances of future events, while statistics involves the analysis of the frequency of past events. More technically, probability is primarily a theoretical branch of mathematics, which studies the consequences of mathematical definitions, while statistics is primarily an applied branch of mathematics, which tries to make sense of observations in the real world. As stated by Steven Skiena, "probability theory enables us to find the consequences of a given ideal world, while statistical theory enables us to measure the extent to which our world is ideal." [25] The likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values. It is a function of the parameters of a statistical model, and cannot be conceived of before an outcome has been observed.

In general, statistics are viewed in the classical or frequentist interpretation, where they are defined in terms of the frequencies of random, repeatable events. In contrast, Bayesian statistics provide a quantification of uncertainty, or a measure of a state of knowledge. It is an extension of logic that enables reasoning with uncertain statements whose outcomes cannot be repeatedly measured. The underlying probability theory of Bayesian and classical statistics is the same. To evaluate the Bayesian probability of a hypothesis, some prior probability must first be specified, which is then updated in light of new relevant data.
For instance, we speak of the probability of events such as the earth being destroyed by an asteroid or a specific individual dying. These are not repeatable events from which we can calculate a probability through ratios involving frequencies. However, they still have a probability of occurrence that can be inferred through educated guessing considering relevant evidence, such as the frequency of asteroids that have come close and the impact it has taken to destroy planets of comparable size, or the person's consumption of carcinogens and participation in hazardous activities. When new evidence is received, it can be processed and the hypothesis reassessed appropriately. Bayesian probability allows quantification of such expressions of uncertainty.

In 1946, physicist Richard Cox showed that if degrees of belief (such as those used in Bayesian probability) are numerically quantified, then a simple set of axioms which encode common-sense properties of these beliefs can be used to intuitively construct a set of rules for manipulating degrees of belief that are equivalent to the sum and product rules of classical statistics. In 2003, statistical theorist Edwin Thompson Jaynes used Cox's axioms to prove that probability theory could be conceived as an extension of Boolean logic to situations involving uncertainty. Over the years, many mathematicians have proposed various sets of axioms similar to Cox's that all have behaved precisely according to the rules previously outlined for classical statistics.
The central idea of Bayesian probability is defined in Bayes' theorem:

p(w|D) = p(D|w) p(w) / p(D)

where D is the set of observed data and w is the event whose uncertainty we wish to numerically express. This equation expresses the uncertainty of the event w depending on the set of observed data D. The conditional probability p(D|w), known as the likelihood function, stands for the probability of the given observed data D if the event w is known to occur. The prior probability distribution p(w) expresses the uncertainty of w before the data D is taken into account, and is most often a purely subjective assessment by someone with experience. The product of these two probabilities is scaled to ensure that the function integrates to one and is therefore a valid probability density. In other words, the posterior probability is proportional to the likelihood times the prior probability distribution, where all of these quantities are viewed as functions of w.

An advantage Bayesian probability holds over classical probability is that the inclusion of prior knowledge arises naturally. For instance, classical statistics would reject the hypothesis that a coin is fair if it lands on tails 90 out of 100 times in an observed data set, whereas Bayesian statistics could include a prior probability distribution reflecting a set of observed data where the coin landed on tails 50 times of 100.

Figure 4: The perceptron function

As evidenced above, the probability Bayes' theorem assigns is highly dependent on the prior distribution used, which is a considerable weakness. Bayesian methods based on poorly chosen prior distributions can yield poor results with high confidence. Despite this, Bayesian principles are still very popular in pattern recognition and other machine learning paradigms, as the ability to incorporate evidence in real time is hugely advantageous.
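The coin example can be made concrete with a discrete version of Bayes' theorem. This is an illustrative sketch, not part of the thesis software: the hypothesis space is reduced to a grid of candidate tails-probabilities, and the binomial coefficient is omitted since it cancels in the normalization.

```python
def posterior(prior, tails, flips):
    """Update a discrete prior over candidate tails-probabilities using
    the likelihood p**tails * (1 - p)**(flips - tails), then normalize
    so the posterior sums to one (the p(D) term of Bayes' theorem)."""
    unnormalized = {p: w * (p ** tails) * ((1 - p) ** (flips - tails))
                    for p, w in prior.items()}
    total = sum(unnormalized.values())
    return {p: w / total for p, w in unnormalized.items()}

# Candidate values for the coin's true probability of tails.
grid = [i / 100 for i in range(1, 100)]

# A flat prior: every candidate equally plausible before seeing data.
flat = {p: 1 / len(grid) for p in grid}

# After observing 90 tails in 100 flips, the posterior mass under the
# flat prior concentrates near p = 0.9.
post = posterior(flat, tails=90, flips=100)
best = max(post, key=post.get)
```

A prior concentrated around p = 0.5 (reflecting, say, 50 tails in a previous 100 flips) would instead pull the posterior back toward fairness, which is exactly the advantage described above.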
4.2 The perceptron

The perceptron was the first influential model of a simple artificial neural network, proposed by psychologist Frank Rosenblatt in 1958 [10]. It is the simplest kind of feedforward (no cycles or loops; the connections only move forward) neural network, functioning as a linear classifier. It works only when two patterns are linearly separable, meaning a hyperplane can be drawn in the problem space in such a way that it completely separates the two classes.

The perceptron function is defined as:

f(x) = 1 if w · x + b > 0, 0 otherwise

where x is the input vector, f(x) is the output vector, w is the weight vector, and b is the bias. The bias is a constant value that serves to alter the position (not orientation) of the decision boundary so that the threshold value is 0. This function is a binary classifier, where 1 indicates a positive case (on one side of the hyperplane) and 0 represents a negative (on the other side of the hyperplane).

The perceptron's method of learning is described by the perceptron learning algorithm (Algorithm 1 on the next page). It uses knowledge of past results to modify the weights of the connections, thereby improving the performance of the network. This is conceptually similar to Bayesian statistics, in which likelihoods can be updated to reflect new data.

Note that the weights are adapted immediately for each pair instead of at once for the entire vector, and also that weights are only updated if there is error, i.e. only when learning will occur. The perceptron convergence theorem states that if there exists an exact solution, the perceptron learning algorithm is guaranteed to learn the classification in a finite number of steps. If there is not an exact solution and the training data is not linearly separable, the algorithm will run indefinitely without converging. Unfortunately, many interesting classifications are not linearly separable and require a more complex decision surface to separate the classes, so perceptrons do not suffice.
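The perceptron function and its learning rule can be sketched as follows. This is an illustrative sketch rather than the thesis code; logical AND is used as the training set because it is a small linearly separable problem, so the convergence theorem guarantees the loop below terminates with a correct classifier.

```python
def predict(w, b, x):
    """Perceptron function: 1 if w . x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train(samples, rate=1, epochs=20):
    """Perceptron learning rule w_i <- w_i + rate * (d - y) * x_i,
    applied immediately after each sample, only when there is error."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, d in samples:
            error = d - predict(w, b, x)
            if error:  # update only when learning will occur
                w = [wi + rate * error * xi for wi, xi in zip(w, x)]
                b += rate * error
    return w, b

# Logical AND is linearly separable, so the algorithm converges.
and_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(and_set)
```

XOR, by contrast, is not linearly separable; on that training set the loop would cycle forever without the epoch cap, which is the failure mode that motivates the multi-layer perceptron of Section 4.3.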
The perceptron has an input sensory layer called a retina, which is partially connected to an association layer. It is important theoretically that these layers are not fully connected; this establishes that each unit in the association layer computes a different function of the image on the retina. The association layer is then reciprocally connected to a response layer by way of unidirectionally modifiable weighted "synapses" (in the direction of the response layer). The goal of the perceptron is the activation of a single appropriate unit in the response layer for a certain set of inputs. The activation of any appropriate unit is a candidate solution.

The basic computing element of the perceptron is an artificial neuron known as the threshold logic unit, or TLU, comprised of n input weights with strengths w[i], as shown in Figure 4. It sums the product of the input weights and their respective strengths and transforms them into binary values indicating activation if the scaled summed input is over a certain threshold. There is no graded activation.

Algorithm 1: Perceptron Learning Algorithm

Input: a training set D = {(x_1, d_1), ..., (x_n, d_n)}
Output: a trained model y = f(x)
Algorithm:
  INITIALIZE the weights and the threshold
    set w_0 = b
    set x_{j,0} = 1
    set w_i(0) = small random values
  TRAIN the network
    for each sample j in D:
      calculate the actual output
        y_j(t) = f(w(t) · x_j) = f(w_0(t) x_{j,0} + ... + w_n(t) x_{j,n})
      adapt the weights
        for each node i in n:
          w_i(t+1) = w_i(t) + γ (d_j − y_j(t)) x_{j,i}
    repeat until iteration error d_j − y_j(t) < ε

where:
  y = f(x) is the perceptron's output for an input vector x
  b is the bias
  D = {(x_1, d_1), ..., (x_n, d_n)} is the training set, where x_j is the input vector and x_{j,i} is the value of the i-th node of the j-th input vector
  d_j is the desired output value of x_j
  w is the weight vector
  w_i is the i-th value of the weight vector, to be multiplied by x_{j,i}
  w_i(t) is the i-th weight at time t
  γ is the learning rate, where 0 < γ ≤ 1 (a learning rate set too high will periodically oscillate around the value of the solution)
  ε is a user-specified error threshold

Figure 5: Multi-layer perceptron
Although its learning capabilities are extremely limited, the perceptron proved to be a valuable tool in prediction and explanation of human cognition. Interestingly, the limitations that make it less effective as a computing device are very similar to those of humans, and as such it was widely used among psychologists and cognitive scientists to help form many theories regarding the workings of the brain. Along with the ADALINE (an early gradient descent algorithm using a non-binary error function), the perceptron framed the network learning problem in a way that was fundamentally accepted and integrated into scientific thought, shaping the evolution of computational science and future network models for years.

4.3 The multi-layer perceptron

The multi-layer perceptron (MLP) is a feed-forward artificial neural network model consisting of multiple fully connected layers of nodes in a directed graph that maps sets of input data onto a set of appropriate outputs [8][4]. It is proven to be a universal function approximator by the universal approximation theorem, otherwise known as the Cybenko theorem.

MLPs contain multiple stages of processing, comprised of an input layer, an output layer, and at least one latent (hidden) layer (referred to as such because their function is obscured from sight). The difference between perceptrons and multi-layer perceptrons lies in the full connectedness and the intermediate units' use of continuous sigmoidal nonlinearities as activation functions instead of step functions, not in the number of layers, as the name suggests. This type of processing means that the neural network function is differentiable with respect to the network parameters, a fact which allows the network to be trained through backpropagation. If the latent units used linear activation functions, the network could be simplified to an input, association, and output layer, i.e. a perceptron (because the composition of successive linear transformations is itself a linear transformation).
Each node of an MLP has a non-linear (sigmoidal) activation function, meant to model the frequency of action potentials of biological neurons, instead of the simplified binary function. The two most common activation functions are the hyperbolic tangent function φ(v_i) = tanh(v_i) with range (−1, 1), and the logistic function φ(v_i) = (1 + e^{−v_i})^{−1} with range (0, 1), where y_i is the output of the i-th node and v_i is the weighted sum of the input synapses.

MLPs use backpropagation, a supervised learning technique, to train the network. Backpropagation is an efficient technique for evaluating the gradient of an error function for a feed-forward neural net so that gradient descent can be used. It can be perceived as a local "message passing" scheme in which information is sent both forwards and backwards through the network. Backpropagation is dependent on the activation function being differentiable. Learning in the network occurs by modifying the connection weights by a factor calculated through gradient descent after a piece of data is processed, so that the change is based on the error generated compared to the expected result.
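The two activation functions above can be compared directly. A small illustrative sketch (not thesis code) showing their bounded ranges and the simple derivatives that make backpropagation workable:

```python
import math

def logistic(v):
    """Logistic activation (1 + e^-v)^-1: range (0, 1)."""
    return 1 / (1 + math.exp(-v))

def tanh_activation(v):
    """Hyperbolic tangent activation: range (-1, 1)."""
    return math.tanh(v)

# Both squash a weighted input sum into a bounded range, and both are
# differentiable, which backpropagation requires:
#   logistic'(v) = logistic(v) * (1 - logistic(v))
#   tanh'(v)     = 1 - tanh(v)**2
```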
The first phase of network training is [forward] propagation. The training data is propagated forward through the network, generating output activations by applying the selected sigmoidal function to the weighted sum of the inputs. The input vector x is applied to the network and propagated forward using a_j = Σ_i w_ji z_i and z_i = h(a_i), where z is the activation, w is the weight, h is the nonlinear activation function, and a is the weighted sum of the inputs. The error d_k is then calculated for all the output units using d_k = y_k − t_k, where y_k is the actual output (a linear combination of the input variables) of the k-th unit of x, and t_k is its target output. After this, backward propagation of the output activations through the ANN occurs, using the expected output vector as a starting point. The algorithm backpropagates using d_j = h′(a_j) Σ_k w_kj d_k to obtain the error for each latent unit in the network. The value of d for a particular latent unit can be obtained by propagating the previous values of d backwards from units higher up in the network. An example of this process can be seen in Figure 10 on page 46. Because we already know the error values for the output units, we can evaluate them for all of the latent units in a feed-forward network by recursively applying the equation above. The equation dE/dw_ji = d_j z_i is used to evaluate the required derivatives, where E is the summed error function, w_ji is the weight, z_i is the activation of unit i, and d_j is the error.
The second phase of training is weight update, using the error values d_k calculated in phase one. For each weighted synapse, you must first multiply d_k and the input activation to get the gradient of the weight, then update the weight in the opposite direction of the gradient by subtracting a ratio of it from the original weight. This ratio is determined by a factor called the learning rate. The learning rate applies a greater or lesser portion of the weight calculated by the past iteration to each new adjustment. It is analogous to the step size parameter of gradient descent.

Algorithm 2: Multi-layer perceptron learning

Input: a set of training vectors and a stopping criterion (number of iterations or desired accuracy)
Output: a trained model
Algorithm:
  Initialize the weights in the network
  While stopping criterion not satisfied
    For each input vector x
      Apply the vector x to the network and forward propagate through the network to find the activations of all the latent and output units
        a_j = Σ_i w_ji z_i
        z_i = h(a_i)
      Evaluate the errors for all the output units
        d_k = y_k − t_k
      Backpropagate the errors to obtain errors for each latent unit in the network
        d_j = h′(a_j) Σ_k w_kj d_k
      Evaluate the required derivatives
        E = (1/2) Σ_k d_k²
        dE/dw_ji = d_j z_i
      Update weights in the negative direction of the derivatives

where:
  a_j is the weighted sum of the inputs of unit j
  z_i is the activation of a unit that sends a connection to unit j
  w_ji is the weight associated with that connection
  h is the nonlinear activation function
  d_k is the error associated with the output units
  y_k is the network output value, a linear combination of input variables
  t_k is a value specified by the error function
  E is the summed error function

Cheshire Engineering Corporation's webpage [9] describes a large learning rate as punishing a child with a spanking: it is highly effective, however inappropriate and counterproductive if the offense is very minor. If the learning rate is too high, the system can diverge or oscillate around the true solution, and if it is too low, convergence will be very slow. Values have range (0, 1].

An optional factor that can affect weight update is the momentum factor. The momentum factor allows a change to the weights to persist for a number of adjustment iterations, thus smoothing out unusual conditions in the training set. It is used to prevent the system from converging to a local minimum or saddle point, intending instead to find the absolute minimum. If the momentum is set too high, the system can overshoot the minimum and eliminate necessary variance in the data set. If it is set too low, the system is more likely to converge to a local minimum. Values have range [0, ∞).

Phases one and two are repeated until the desired accuracy is achieved, or for a set number of iterations. If the set number of iterations is too low, the system might not have a chance to fully converge. If it is set too high, the system will converge and then run pointlessly for the specified duration. This can be avoided by combining a high number of iterations with a desired accuracy, so that training terminates if either is reached.

The performance of a neural net depends significantly on its number of hidden layers. The higher the number of hidden layers, the more nuance the network can model; however, too many layers often leads to overtraining. Overtraining is a phenomenon that occurs when the network models its training set too perfectly, including insignificant details that do not aid in classification. When this occurs, the network ceases to function on any new data.

The main practical limitation to the performance of MLPs is slow convergence that is not guaranteed. Because gradient descent may converge to any local minimum on the error surface (a p-dimensional object in a (p+1)-dimensional space) depending on the starting values, the MLP may not reach the absolute minimum. Despite this, multi-layer perceptrons employing a backpropagation algorithm are the standard algorithm for any supervised learning pattern recognition process. They are useful in research in terms of their ability to solve problems stochastically, which often allows approximate solutions for extremely complex problems to be calculated.
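Algorithm 2 can be sketched end to end for a network with one hidden layer. This is an illustrative sketch, not the bpnn.py module described later in this thesis: it uses the logistic activation, a fixed learning rate, no momentum term, and XOR (which a single-layer perceptron cannot solve) as a stand-in training set; the network sizes and seed are arbitrary choices.

```python
import math, random

def train_mlp(samples, n_hidden=3, rate=0.5, epochs=2000, seed=1):
    """One-hidden-layer MLP trained per Algorithm 2: forward propagate,
    compute output errors, backpropagate, update weights."""
    random.seed(seed)
    n_in = len(samples[0][0])
    # w_h[j][i]: weight from input i to hidden unit j; the extra index
    # is a bias treated as an always-on input of 1.
    w_h = [[random.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [random.uniform(-1, 1) for _ in range(n_hidden + 1)]
    g = lambda a: 1 / (1 + math.exp(-a))          # logistic activation
    for _ in range(epochs):
        for x, t in samples:
            xb = list(x) + [1.0]
            # forward propagation: a_j = sum_i w_ji z_i, z_j = h(a_j)
            z = [g(sum(wji * xi for wji, xi in zip(row, xb))) for row in w_h]
            zb = z + [1.0]
            y = g(sum(wk * zk for wk, zk in zip(w_o, zb)))
            # output error d_k = y_k - t_k, scaled by h'(a) = y(1 - y)
            d_out = (y - t) * y * (1 - y)
            # backpropagated hidden errors d_j = h'(a_j) sum_k w_kj d_k
            d_hid = [z[j] * (1 - z[j]) * w_o[j] * d_out for j in range(n_hidden)]
            # update weights in the negative direction of dE/dw_ji = d_j z_i
            w_o = [wk - rate * d_out * zk for wk, zk in zip(w_o, zb)]
            for j in range(n_hidden):
                w_h[j] = [wji - rate * d_hid[j] * xi for wji, xi in zip(w_h[j], xb)]
    def predict(x):
        xb = list(x) + [1.0]
        z = [g(sum(wji * xi for wji, xi in zip(row, xb))) for row in w_h] + [1.0]
        return g(sum(wk * zk for wk, zk in zip(w_o, z)))
    return predict

# XOR is not linearly separable, so it needs the hidden layer.
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
predict = train_mlp(xor)
```

As the surrounding text notes, convergence is not guaranteed: a different seed or learning rate may land in a worse local minimum, which is exactly the limitation the momentum factor is meant to mitigate.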
5 Considered and selected approaches

5.1 Questions of intent

Prior to any software undertaking, it is necessary to determine answers to a few key questions that will shape the approach to the solution. Each chosen direction leads to a new set of questions, creating a candidate map along the many roads to resolution.

In regard to this thesis, the first such decision was which approach should be taken towards the problem of medical image recognition and what the focus should be. There were many possible directions: a focus on research, a focus on comparative performances of methods, a focus on a functional product, a focus on an optimized product, or any combination of these. Each approach had its merits and disadvantages. For instance, while researching different algorithmic approaches and comparing their past performances in other studies would supplement general knowledge of algorithms and machine learning, it wouldn't improve programming skills. Contrastingly, generating a functional product would provide a plethora of practical experience, but the subsequent knowledge of computer vision and machine learning techniques would be selectively deep rather than broad. In the end, practical knowledge trumped theoretical for two reasons: a finished product is beneficial as a display of abilities for future employment opportunities, and also as a potential diagnostic tool for a serious disease.
This verdict led to another fork: whether to write the program from scratch or make use of open source code. Programming from the ground up could result in a unique library geared towards computer vision and machine learning, as well as a very thorough understanding of each intricacy of the process, concluding with a wealth of experience in algorithms and software development. However, creating a strongly associative classifier for data as complex as skin lesion images would presumably require more than two semesters. The other option, utilizing preexisting software and libraries as building blocks and expanding on the products of those who have successfully conquered aspects of the problem, actually poses the same design challenges as independent coding and requires a comparable level of comprehension, while at the same time being much more applicable to the field of practical software development. In a company vying for profit, time is valuable and none is wasted rewriting existing code. The time saved can be put to use revising and optimizing, or to being a step ahead of the competition. Not only is reusing code highly advantageous in this situation (a single programmer with a complex problem), but employers are more likely to be impressed by results than determination. Additionally, identification of melanoma is a relevant problem and the proposed software could have present and propitious applications. Obviously, it is both more responsible and reasonable to follow the route of functionality and marketability.

With that settled, the next task would usually be to determine the desired algorithmic approach. In this case, the original intent was to explore artificial neural networks, so this was not an issue. As stated previously, artificial neural networks function very well on classification problems. However, other classification-oriented machine learning algorithms exist; among them are quadratic classifiers, decision trees, Bayesian networks, and hidden Markov models.
5.2 Preprocessing techniques

While neural networks are flexible in their functionality, passing them a complete, unaltered image is unlikely to be the best option. Typically, preprocessing techniques are used to convert the image into a data representation of its key features that is passed to the network. There are countless ways of approaching this problem.

Regardless of method, the first step of preprocessing an image is normalization. The image must be centered and scaled to an appropriate size and level of detail (ideally uniformly relative across the data set) while retaining information about the absolute size, lighting, and color. Usually, this involves first calculating data that the process of normalization alters, then cropping and magnifying the image to facilitate feature extraction. The next step is to select the features you wish to extract. There are many established paradigms of feature extraction within the field of medical image recognition; these can be used as either set rules or guidelines. After selecting the features, they must be extracted. There are various ways of representing the extracted data to the network. The most common method is a binary vector where each input represents either the presence or absence of a feature. While assigning these nuanced variables to a binary does not account for the variability of feature representation and may diminish accuracy for a small set of selective features, most successful methods employ a large set of selectors, thus reducing the amount of loss. Also, binary responses are inherently characteristic of neural networks, and they are excellent classifiers in spite of the apparent lack of nuance.
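The binary-vector representation described above can be sketched simply. The feature names here are hypothetical placeholders for illustration, not the thesis's actual selectors:

```python
def to_binary_vector(detected, all_features):
    """Encode an image's extracted features as a binary input vector:
    one entry per feature, 1 if present and 0 if absent."""
    return [1 if f in detected else 0 for f in all_features]

# Hypothetical feature list for illustration only.
features = ["pigment_network", "streaks", "blue_whitish_veil"]
vector = to_binary_vector({"streaks"}, features)  # -> [0, 1, 0]
```

Each position in the vector maps to one input node of the network, which is why the input layer size must equal the number of selectors.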
5.3 Selection of features

Before feature extraction can occur, a schema of feature selection must be determined [5]. Features selected for extraction must be quantifiably measurable and representative, with both a high sensitivity and specificity. Many established schemata for feature selection of skin lesion images already exist. Most of these schemata numerically assign values for certain dermoscopic criteria, falling into the categories of local and global features. Global features allow an expeditious preliminary categorization of a skin lesion, following which local features are used to assess the lesion in detail. Local features can be thought of as the "letters of the dermoscopic alphabet." They consist of pigment networks, dots and globules, streaks, a blue-whitish veil, pigmentation, hypopigmentation, regression structures, and vascular structures. Global features, composed of local ones, are reticular, globular, cobblestone, homogeneous, starburst, parallel, multicomponent, lacunar, and unspecific patterns.

Pigment networks resemble a thin grid superimposed over the lesion's background. Dots and globules are the names given to irregular rounded structures in varying colors and sizes. Streaks are linear structures of varying thickness, which may be observed throughout the lesion but are not clearly combined with pigment network lines. A blue-whitish veil is a hazy and diffuse pigmentation overlaid on other structures. Pigmentation refers to dark areas which preclude recognition of more subtle dermoscopic features. Hypopigmentation refers to an area of decreased pigmentation within an otherwise ordinarily pigmented lesion. Regression structures are areas of white, blue, or both, resembling either a scar or speckled peppering. They are sometimes mistaken for a blue-grey veil when the peppering is not thoroughly distinctive. There are seven types of vascular structures: comma vessels, wreath vessels, arborizing vessels, hairpin vessels, dotted vessels, linear irregular vessels, and vessels within regression structures. They become more readily apparent when the tumor is very lightly compressed.
Reticular patterns are the most common in melanocytic lesions, characterized by a pigment network covering most of the surface of the lesion. Globular patterns consist of numerous irregularly distributed globules. Cobblestone patterns are very similar, but the globules are larger and closely aggregated, resembling cobblestones. Homogeneous patterns appear in the absence of other features as diffuse pigmentation ranging from brown to blue-grey to reddish-black. Starburst patterns appear at the edges of skin lesions and are comprised of pigmented streaks in a radial arrangement. Parallel patterns are exclusive to lesions in glabrous (hairless) regions and are characterized by a series of parallel pigmented streaks. Multicomponent pattern is the name given to combinations of three or more distinctive dermoscopic structures in a given region. Lacunar patterns are aggregations of red lacunas: rounded structures with smooth borders in either red, blue-purple, or black. When a structure cannot be categorized into any of the aforementioned patterns, it is deemed "unspecific."

In addition to local and global features, asymmetry, border, and color are significant dermoscopic criteria.

Three of the most well-established and benchmarked methods for feature selection are the ABCD rule, Menzies' method, and the seven-point checklist.

5.3.1 ABCD rule (Stolz method)

The ABCD method is an algorithmic analogue of the heuristic used by dermatologists to distinguish melanoma from melanocytic lesions [21]. Like the heuristic, it is an acronym in which the A stands for asymmetry, the B for border, C for color, and D for differential structures. It was developed by W. Stolz and colleagues in 1994, and its reliability was proven by F. Nachbar et al. the same year. Two independent dermatopathologists examined and scored by hand 172 melanocytic lesions consisting of 69 melanomas and 103 melanocytic nevi, yielding a sensitivity of 92.8% and a specificity of 90.3%. The method is easily learned and rapidly calculated, providing a reliable, objective, and reproducible preliminary diagnosis of melanoma.
The procedure of Stolz's method is to semiquantitatively assess each of the four aforementioned categories and assign them a numeric score. Each score is scaled by a weight factor indicating empirically determined importance, then the sum of these weighted scores yields the total dermatoscopy score, or TDS. Multivariate analysis yielded the weights:

TDS = 1.3 A + 0.1 B + 0.5 C + 0.5 D

Figure 6: Stolz's ABCD rule for identifying melanoma (image obtained from dermoscopy.org [5])

A TDS of less than 4.75 is considered benign, between 4.8 and 5.45 suspicious, and greater than 5.45 highly suspicious.

Asymmetry is calculated by bisecting the lesion with two optimally positioned perpendicular axes. No points are awarded if the lesion is symmetric over both axes, one is awarded if it is symmetric over one axis, and two are awarded if neither axis yields symmetry. Symmetry is assessed considering colors and structures in addition to shape. At 54% of the TDS, this is the most highly weighted criterion; the majority of melanomas receive a score of two, compared to only 25% of benign lesions.

The border score ranges from zero to eight, where each point represents a sharp and distinct cut-off in a 45° (π/4 radian) "pie piece." This is the weakest criterion, accounting for only 4% of the TDS. The majority of melanomas score between three and eight points, whereas most benign lesions score below three.

One color point is awarded for each color present in the lesion. The palette consists of light brown, dark brown, red, blue-grey, black, and white (lighter than adjacent skin). Thus the score ranges from one to six, where most melanomas score three or more, and 40% score five to six. Color accounts for 21% of the TDS.

The final criterion of Stolz's method is the presence of differential structures. The five candidates are a pigment network, streaks (two or more), dots (two or more), globules (one or more), and structureless or homogeneous areas (covering 10% or more of the lesion). A point is awarded for each clearly visible structure, ranging from one to five. Like color, differential structures account for 21% of the TDS.
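The TDS computation is simple to express in code. A minimal sketch of the published formula, assuming the four sub-scores have already been assessed by hand; the category labels follow the thresholds above, and how to classify the small gap between 4.75 and 4.8 in the published cutoffs is a choice made here, not specified by the source.

```python
def tds(asymmetry, border, color, structures):
    """Total dermatoscopy score: TDS = 1.3A + 0.1B + 0.5C + 0.5D.
    Expected sub-score ranges: A in 0-2, B in 0-8, C in 1-6, D in 1-5."""
    return 1.3 * asymmetry + 0.1 * border + 0.5 * color + 0.5 * structures

def classify(score):
    """Map a TDS to the categories given in the text."""
    if score < 4.75:
        return "benign"
    if score <= 5.45:
        return "suspicious"
    return "highly suspicious"

# A symmetric, even-bordered lesion with one color and one structure:
low = tds(0, 0, 1, 1)    # 1.0 -> benign
# Maximal sub-scores in every category:
high = tds(2, 8, 6, 5)   # 8.9 -> highly suspicious
```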
5.3.2 Menzies' method

Menzies' method, formulated by Menzies in 1996, is a simple equation [5]:

M = p − n

where M is Menzies' score, p stands for the number of positive features, and n stands for the number of negative features. The higher the score, the greater the chance of melanoma. The set of negative features consists of symmetry and a single even color. The set of positive features consists of a blue-whitish veil, brown dots, pseudopods, radial streaming, scar-like depigmentation, five to six colors, blue or grey dots, and a broadened network. Possible scores range from negative two to eight.

Menzies' method is not a very strong classifier and is only used as a very preliminary test indicating whether further diagnostic action should be taken.

5.3.3 Seven-point checklist

Proposed by G. Argenziano in 1998, the seven-point checklist is a list of seven features selected for their high correlation with instances of melanoma [14]. Three features are considered "major," receiving two points, and four are considered "minor," receiving one. The scores for each feature are summed, and if the score is greater than three, the likelihood of melanoma is high. Argenziano found that using this method, dermatologists were able to classify melanoma with a sensitivity of 95% and a specificity of 75%, correctly diagnosing 82% of melanocytic lesions.

The major criteria consist of an atypical pigment network, a blue-whitish veil, and an atypical vascular pattern. The minor criteria consist of irregular streaks, irregular pigmentation, irregular dots and/or globules, and regression structures.
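The seven-point scoring can be sketched as follows. This is illustrative only; the criteria lists are taken directly from the text above:

```python
MAJOR = {"atypical_pigment_network", "blue_whitish_veil",
         "atypical_vascular_pattern"}
MINOR = {"irregular_streaks", "irregular_pigmentation",
         "irregular_dots_globules", "regression_structures"}

def seven_point_score(present):
    """Two points per major criterion present, one per minor;
    a total greater than three indicates likely melanoma."""
    return (sum(2 for f in present if f in MAJOR)
            + sum(1 for f in present if f in MINOR))

# A veil plus two minor criteria scores 4 (> 3, so melanoma is likely).
score = seven_point_score({"blue_whitish_veil", "irregular_streaks",
                           "regression_structures"})
```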
5.3.4 Formulated method used in this thesis

Obviously, the point of an approach utilizing an artificial neural net is to establish a new method for diagnosing melanoma. The preceding approaches were useful as a guideline and helpful in suggesting benchmarks against which to compare the network's performance. Unlike these approaches, however, the extraction process is performed by a machine instead of a trained professional human eye, which drastically decreases its accuracy. Preprocessing extracts information about features present in the lesion, after which the network is expected to reach the same conclusion about the representativeness of certain features over others, if not the same overall performance.

After careful consideration of some strongly correlated features and algorithms through which to extract each, it was determined that the feature vector should include five entries representing strong corners present in the edge, five entries representing the presence of different colors on the surface, and one representing the presence of differential features. The reasoning is that this approach, while unique, is derivative of Stolz's ABCD method [11].

The first part of the technique is taken directly from Stolz's method. A vector is returned indicating with either a one or a zero if the colors brown, black, white, blue, and/or red are present. First, a filter is applied to the image that turns any pixel white if its RGB value is not within the range of the specified color and black if it is. Then a detection window is looped over the resultant image until a region is found in which the ratio of represented pixels to total pixels is above a certain threshold.

Figure 7: Color detection
Figure 8: Corner detection
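The color-presence step can be sketched without OpenCV. This is an illustrative pure-Python sketch of the approach just described (a per-channel RGB range mask followed by a sliding detection window); the window size, threshold, and color range used here are hypothetical, not the thesis's actual parameters.

```python
def color_present(image, lo, hi, window=2, threshold=0.5):
    """Mask pixels whose RGB values fall inside [lo, hi] per channel,
    then slide a window over the mask looking for any region where the
    ratio of in-range pixels to total pixels exceeds the threshold."""
    mask = [[all(l <= c <= h for c, l, h in zip(px, lo, hi)) for px in row]
            for row in image]
    rows, cols = len(mask), len(mask[0])
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            hits = sum(mask[r + dr][c + dc]
                       for dr in range(window) for dc in range(window))
            if hits / window ** 2 > threshold:
                return 1
    return 0

# A 3x3 toy "image": a dark-brown patch in the top-left corner.
brown = (90, 60, 30)
white = (255, 255, 255)
img = [[brown, brown, white],
       [brown, brown, white],
       [white, white, white]]
has_brown = color_present(img, lo=(60, 30, 0), hi=(120, 90, 60))  # -> 1
```

The early return also illustrates the failure mode noted in Section 5.4: with unlucky window placement, a color region split across windows can fall just below the threshold in every window and be missed entirely.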
The presence of strong corners indicates both that the lesion is asymmetric and that the edge is sharp. First, the image is run through a bilateral filter, which reduces noise and smooths the image while preserving edges. The filter functions by replacing the intensity value of each pixel in an image with a weighted average of intensity values from adjacent pixels in a defined region. It preserves edges by calculating the weights using specifiable radiometric (color intensity) differences as well as Euclidean distance. Because of the typically strong radiometric difference between a lesion and the surrounding skin, and the weaker difference between structures within a lesion, a bilateral filter with a relatively large tolerance for radiometric changes and a medium-sized window returns an image of a lesion whose structures are obscured but whose edges are clearly discernible. After this, a corner detection algorithm (OpenCV's GoodFeaturesToTrack()) is applied to the image, returning corners with big eigenvalues as compared to a specified quality level times the maximum eigenvalue in a defined region. Finally, a vector is returned with up to five ones indicating how many corners are above the provided threshold.

Figure 9: Structure detection

The last part of the technique determines whether or not differential structures are present in the lesion. First, the blurred image from the corner detection step is passed through a Canny edge detector that traces around the lesion. Next, an ellipse is fitted around the outline of the image and the bounding rectangle of the ellipse is returned. After that, the image is converted to greyscale and the contrast of the image is doubled. Feature detection occurs when an algorithm is run on the region inside the aforementioned bounding rectangle that detects if the radiometric variance within the lesion is above a certain threshold. This is based on the observation that differential structures manifest themselves visually as areas with different coloring than the surrounding region.

This process covers the bases of asymmetry, border, color, and differential structures, just like Stolz's method. Each image in the data set is passed through this program and reduced to an 11-point feature vector, with a 12th feature called the bias indicating whether or not the training vector represents a melanoma. The set of feature vectors is then passed to the network to learn.

5.4 Practical limitations

Unfortunately, classifier programs are practically bounded by certain limitations, the biggest of which are connections to local doctors, financial capital, computing power, and man-hours. These constraints limit practicable image acquisition techniques, affordable software, and the general scope of the project. With no connections or financial capital, limited computing power, and only one programmer, this project was seriously constrained in comparison to other studies.

The first problem was the acquisition of a training set of images. Most of the published studies for machine classification of melanoma obtained images from a local dermatologist. This option was unavailable for this project, so instead free online databases were used. There were only two apparent free options (dermquest.org and vase.essex.ac.uk/software/jasmine/moles.html [2][22]). Between the two of these sites, 101 images of melanoma and 129 benign melanocytic lesions were obtained, consisting very predominantly of dysplastic nevi. Since data sets must be split into subsets for training, testing, and validation, that left a limited supply where melanoma were vastly overrepresented (in reality, less than 3% of melanocytic lesions are melanomas, as opposed to the whopping 44% in this data set). Additionally, the data represents a very strong selection bias towards dysplastic nevi, which strongly resemble melanoma, because very few people go to the dermatologist to examine a common nevus. Biases in the input data reflect in neural nets' classifications.
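The structure-detection step of Section 5.3.4 (thresholding the radiometric variance inside the lesion's bounding rectangle) can be sketched without OpenCV. This is a pure-Python illustration; the greyscale toy region, the contrast-doubling rule, and the variance threshold are hypothetical stand-ins for the thesis's actual OpenCV pipeline.

```python
def has_differential_structures(region, threshold):
    """Double the contrast of a greyscale region about its mean, then
    report 1 if the radiometric (intensity) variance exceeds threshold."""
    pixels = [p for row in region for p in row]
    mean = sum(pixels) / len(pixels)
    stretched = [mean + 2 * (p - mean) for p in pixels]  # doubled contrast
    variance = sum((p - mean) ** 2 for p in stretched) / len(stretched)
    return 1 if variance > threshold else 0

# A flat region has zero intensity variance; a mottled one does not.
flat = [[120, 120], [120, 120]]
mottled = [[40, 200], [200, 40]]
no_structures = has_differential_structures(flat, threshold=100)   # -> 0
structures = has_differential_structures(mottled, threshold=100)   # -> 1
```

This also makes visible the pitfall noted above: structures whose intensity is close to the lesion's mean survive the contrast doubling barely changed and can still fall under the threshold.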
5.4 Practical limitations

Unfortunately, classifier programs are practically bounded by certain limitations, the biggest of which are connections to local doctors, financial capital, computing power, and man-hours. These constraints limit practicable image acquisition techniques, affordable software, and the general scope of the project. With no connections or financial capital, limited computing power, and only one programmer, this project was seriously constrained in comparison to other studies.

The first problem was the acquisition of a training set of images. Most of the published studies of machine classification of melanoma obtained images from a local dermatologist. This option was unavailable for this project, so free online databases were used instead. There were only two apparent free options.²³ Between these two sites, 101 images of melanoma and 129 benign melanocytic lesions were obtained, consisting very predominantly of dysplastic nevi. Since data sets must be split into subsets for training, testing, and validation, that left a limited supply in which melanoma were vastly overrepresented (in reality, less than 3% of melanocytic lesions are melanomas, as opposed to the whopping 44% in this data set). Additionally, the data represent a very strong selection bias towards dysplastic nevi, which strongly resemble melanoma, because very few people go to the dermatologist to examine a common nevus. Biases in the input data are reflected in a neural net's classifications.

Due to time constraints and having only a single programmer, it was imperative to make use of software that had already been written rather than customizing along each facet. Additionally, due to lack of funding, no professional software could be used, only freeware. Of this freeware, some very promising libraries had recently become inactive due to conflicting retrograde dependencies with host programs, and some did not have a wrapper for a familiar language. Luckily, open source software is increasingly popular, so sufficient computer vision and artificial neural network libraries were not difficult to find. (See Section 5.5)

²³ From dermquest.org and vase.essex.ac.uk/software/jasmine/moles.html [2][22]

Perhaps most importantly, the preprocessing technique is limited so that the program runs in a reasonable amount of time. Ideally, the parameters at each step of the process would be individualized for each image. However, automation of the tailoring process is in and of itself a whole field of machine learning. This means that each image must be processed with a uniform set of values that may work very well or very poorly, depending on the image (but on average work acceptably). This uncertainty pervades the entire extraction process. For instance, in the blurring step, a high tolerance for radiometric changes may completely obscure a lesion composed of many globules with soft edges. The color detection algorithm is capable of missing an entire region of a certain color due to unlucky window placement where, at each step, the presence ratio is just below the threshold. And increasing the contrast in the last step can obscure structures whose pixel intensity was similar to the background of the lesion. Unfortunately, the constraints of this project allow neither the time nor the computing power to solve these problems, so the project must rely on the strength of the selected features, as well as the power of the neural net, to mitigate them.

Despite these numerous limitations, artificial neural networks remain strong and flexible classifiers, and it remains feasible to produce a high-functioning and strongly-correlated network that classifies melanoma.
5.5 Software

5.5.1 Computer vision

OpenCV²⁴ is an open source computer vision library written in optimized C. Its goal is to "provide a simple-to-use computer vision infrastructure that helps people build fairly sophisticated vision applications quickly." It includes 500+ functions covering factory product inspection, medical imaging, security, user interfacing, camera calibration, stereo vision, robotics, and more, and it contains a full machine learning library (MLL) focused on statistical pattern recognition and clustering.

²⁴ From Gary Bradski and Adrian Kaehler, "Learning OpenCV: Computer Vision with the OpenCV Library" [11]

5.5.2 Artificial neural network

The neural network used is a modification of Neil Schemenauer's Python module bpnn.py, which he placed in the public domain through his personal website.²⁵ It is a simple and customizable multi-layer perceptron utilizing backpropagation. It uses the logistic function f(x) = 1 / (1 + e^(-x)) for the hidden and output activations. Output activations are equivalent to the classification, since they represent the percent likelihood of a positive case.

²⁵ Located at http://arctrix.com/nas/ [24]

A network is initiated with the number of nodes in the input, hidden, and output layers. In the input layer, the number of nodes corresponds to the number of features for each input, and in the output layer it corresponds to the number of output values for each input. In this model, there is only one hidden layer with a customizable number of nodes, and individual weighted connections between each node in the input and hidden layers and the hidden and output layers, with a non-linear activation function. This creates a flow functionally similar to a network of multiple hidden layers with fewer nodes per layer, but with less computation. There are still multiple levels of non-linearity (at the hidden and output levels), which is nuanced enough to model most phenomena.
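The logistic activation is simple enough to state directly; a minimal sketch (illustrative, not bpnn.py itself):

```python
import math

def logistic(x):
    """Logistic (sigmoid) function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def node_activation(inputs, weights):
    """A node's activation is the logistic of its weighted input sum."""
    return logistic(sum(i * w for i, w in zip(inputs, weights)))
```

Because the output lies strictly between 0 and 1, it can be read directly as the network's estimated likelihood that a lesion is a melanoma.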
Next, the training step is called with the list of inputs as a mandatory parameter and the number of iterations, learning rate, and momentum factor as optional parameters defaulted to 5000, 0.5, and 0.9, respectively. The list of inputs is formatted such that each entry in the list is a list containing both a vector of input values and a vector of output values. For example,

    XOR = [[[0,0], [0]],
           [[0,1], [1]],
           [[1,0], [1]],
           [[1,1], [0]]]

encodes the XOR (exclusive OR) function, which returns a positive if either one of the entries is positive, but not both.

The network is trained one input vector at a time. The first step of training for each vector is forward propagation, which consists of calculating the hidden and output activations. This is done using the logistic function with x equal to the sum of all of the inputs to the node scaled by their corresponding weights. The next step of training is backpropagation. First, the error term for each node in the output layer is calculated as a(1 - a)(t - a), where a is the node's activation and t is the target value. Next, the error term for each node in the hidden layer is calculated as a(1 - a)δ, where a is the node's activation and δ is the sum of each output-layer node's error term scaled by the corresponding weight. After this calculation, the weights between the hidden and output layers are modified, followed by the weights between the input and hidden layers. Since the layers are fully connected, the weights are stored in matrices with a row for each node in the first layer and a column for each node in the second, meaning there is an entry for every pair of nodes. For each weight between the layers, a value is added (it is not always positive). The value is Nea + Mc, where N is the learning rate, e is the error term of the particular output node, a is the activation of the particular hidden node, M is the momentum factor, and c is the value of ea from the last adjustment cycle. After the provided number of iterations, the network is trained.

Returning to the example of the XOR function, Figure 10 on the next page shows the steps of network training for the first iteration of the first row, [[0,0], [0]]. For the sake of convenience, the weight matrices are initialized with all entries 0.5 instead of at random. The number of nodes in the hidden layer is two.
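The forward-propagation and backpropagation steps just described can be sketched in pure Python. This is a simplified re-creation in the spirit of bpnn.py, not the thesis module itself; the class name and defaults are illustrative:

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyMLP:
    """Minimal multi-layer perceptron with backpropagation and momentum:
    one hidden layer, plus a constant bias node on the input layer."""

    def __init__(self, ni, nh, no, seed=0):
        random.seed(seed)
        self.ni = ni + 1                                # +1 bias input
        self.nh, self.no = nh, no
        self.ai = [1.0] * self.ni
        self.wi = [[random.uniform(-1, 1) for _ in range(nh)]
                   for _ in range(self.ni)]             # input -> hidden
        self.wo = [[random.uniform(-1, 1) for _ in range(no)]
                   for _ in range(nh)]                  # hidden -> output
        self.ci = [[0.0] * nh for _ in range(self.ni)]  # last changes,
        self.co = [[0.0] * no for _ in range(nh)]       # for momentum

    def update(self, inputs):
        """Forward propagation: logistic of the weighted input sums."""
        for i in range(self.ni - 1):
            self.ai[i] = inputs[i]
        self.ah = [logistic(sum(self.ai[i] * self.wi[i][j]
                                for i in range(self.ni)))
                   for j in range(self.nh)]
        self.ao = [logistic(sum(self.ah[j] * self.wo[j][k]
                                for j in range(self.nh)))
                   for k in range(self.no)]
        return self.ao

    def back_propagate(self, targets, N, M):
        # Output error terms: a(1 - a)(t - a).
        out_d = [a * (1 - a) * (t - a) for a, t in zip(self.ao, targets)]
        # Hidden error terms: a(1 - a) * (downstream errors scaled by weights).
        hid_d = [self.ah[j] * (1 - self.ah[j]) *
                 sum(out_d[k] * self.wo[j][k] for k in range(self.no))
                 for j in range(self.nh)]
        # Each weight gains N*e*a plus M times its previous change.
        for j in range(self.nh):
            for k in range(self.no):
                change = out_d[k] * self.ah[j]
                self.wo[j][k] += N * change + M * self.co[j][k]
                self.co[j][k] = change
        for i in range(self.ni):
            for j in range(self.nh):
                change = hid_d[j] * self.ai[i]
                self.wi[i][j] += N * change + M * self.ci[i][j]
                self.ci[i][j] = change
        return sum(0.5 * (t - a) ** 2 for t, a in zip(targets, self.ao))

    def train(self, patterns, iterations=2000, N=0.5, M=0.9):
        for _ in range(iterations):
            err = 0.0
            for inputs, targets in patterns:
                self.update(inputs)
                err += self.back_propagate(targets, N, M)
        return err
```

Trained on the XOR patterns above, each backpropagation pass nudges the output of `update` toward its target, and the summed squared error falls over successive iterations.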
Figure 10: First step of XOR function network training for first input

6 Final product

6.1 Results

6.1.1 Project results

Despite numerous limitations, the project still achieved considerable success. Its performance can be seen in Figure 11.²⁶ After 40 trial runs charting performance over various settings, it was determined that on average, a neural net trained with ideal parameter values is able to correctly classify lesions 64.48% of the time, with average sensitivity 53.22% and average specificity 72.82%. Training was performed with a random 80% of the lesion set and tested with its complement.

The ideal parameters are 11 nodes in the input layer, eight nodes in the hidden layer, one node in the output layer, a learning rate of 0.5, a momentum factor of 0.9, and 5000 iterations. Over a variety of settings, performance ranged from 44-78% accuracy, 12-86% sensitivity, and 54-100% specificity. These results were invariably more accurate than random guessing (which would yield 44% accuracy, with 101 melanoma and 129 benign melanocytic lesions). Especially considering the strong selection bias of the data set towards dysplastic nevi, these results are impressive, if less than ideal. Every lesion the network trained on was suspected to be a melanoma by a dermatologist. Unfortunately, despite considerable searching, no standardized data set for benchmarking performance on this problem was mentioned or found.
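The accuracy, sensitivity, specificity, and predictive-power statistics used throughout this chapter follow the standard confusion-matrix definitions. As an illustrative sketch (not part of the project's stats.py, and with a hypothetical function name):

```python
def classifier_stats(tp, fp, tn, fn):
    """Standard confusion-matrix statistics, returned as fractions in [0, 1].
    tp/fp = true/false positives, tn/fn = true/false negatives."""
    return {
        "accuracy":    (tp + tn) / float(tp + fp + tn + fn),
        "sensitivity": tp / float(tp + fn),  # fraction of melanoma caught
        "specificity": tn / float(tn + fp),  # fraction of benign lesions cleared
        "ppv":         tp / float(tp + fp),  # positive predictive power
        "npv":         tn / float(tn + fn),  # negative predictive power
    }
```

For example, the sample run in Section 7 (18 of 21 melanoma and 13 of 19 benign lesions correct) corresponds to sensitivity 18/21 ≈ 85.7%, specificity 13/19 ≈ 68.4%, and accuracy 31/40 = 77.5%.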
²⁶ Accuracy is [true positives + true negatives] / [true positives + false positives + true negatives + false negatives], sensitivity is [true positives] / [true positives + false negatives], specificity is [true negatives] / [true negatives + false positives], positive predictive power is [true positives] / [true positives + false positives], and negative predictive power is [true negatives] / [true negatives + false negatives].

Figure 11: Results

The first parameter tested was the number of nodes in the hidden layer. For this step, the learning rate was held constant at 0.5, the momentum at 0.9, and the number of iterations at 5000. With a small number of nodes (four), results varied widely depending on the initial distribution of the data. Over six runs, total accuracy ranged from 51-78%, with specificities between 24-86% and sensitivities between 55-70%. These networks were not able to reach a consistently appropriate level of nuance with so few nodes through which to create a function flow, although it is worth noting that the network with the best performance was trained under these conditions. Six nodes performed stronger but were similarly inconsistent and had very low sensitivities (45% on average). Eight nodes proved to be much more consistent, with accuracy hanging tightly around 64%, sensitivity around 53%, and specificity around 73%. At ten nodes, the network began to exhibit signs of overtraining, declining to 61% average accuracy, 52% average sensitivity, and 73% average specificity.

The ideal number of iterations was easy to determine and required little testing. Although the error almost invariably continues to decrease with each iteration, it does so at an increasingly slow rate. After the first several hundred, improvement drops considerably; after several thousand it is on the order of hundredths of a percent every hundred iterations. Five thousand iterations ran in a reasonable amount of time and produced strong results not significantly weaker than 8000.

Finally, the learning rate and momentum factor were tested. While a higher momentum yielded higher average accuracy (68%), average sensitivity dropped (47%).
Increasing the learning rate increased the specificity (79%); however, for the problem of melanoma identification, specificity is the least important value. With the concurrent drops in accuracy (60%) and sensitivity (37%), no variation performed better than the standard respective values of 0.5 and 0.9.

6.1.2 Similar project results

Teams of doctors and scientists have conducted similar projects with higher yet comparable levels of success.

In chronological order, the first such project is Fikret Ercal et al.'s "Neural network diagnosis of malignant melanoma from color images" [13]. Their project was remarkably similar to this one, except that Ercal et al. focused mainly on color data (asymmetry and border data were included as well). Ercal et al. used a data set of 240 lesions comprised of 120 nevi (40 of which were dysplastic) and 120 melanoma. Their best results came from 14 input features, seven hidden neurons, and one output classification. Their average accuracy was 76.8%, average sensitivity was 80%, and average specificity was 86.3%.

The next project, Stephan Dreiseitl et al.'s "Comparison of machine learning methods for the diagnosis of pigmented skin lesions" [15], trained an artificial neural network on 107-feature vectors calculated from 1619 lesions (comprised of 1290 common nevi, 224 dysplastic nevi, and 105 melanoma). Their average accuracy was 82.88%, average sensitivity 76.86%, and average specificity 73.95%.

Next is Yi Ding et al.'s "A computer assisted diagnosis system for malignant melanoma using 3D skin surface texture features and artificial neural network" [16]. Yi Ding et al. used only 46 lesions: 34 benign lesions and 12 melanomas. Their goal was to compare the performance of networks trained on 2D and 3D feature information extracted from the same images. Both networks had a sensitivity of 91.7%; however, the network trained on two-dimensional feature data had a specificity of only 25.7%, compared to 76.4% for the network trained on three-dimensional feature data (accuracy statistics were not available).
Ilias Maglogiannis and Charlampos N. Doukas' "Overview of Advanced Computer Vision Systems for Skin Lesions Characterization" [20] used a data set of 3707 lesions, comprised of 2598 common nevi, 1014 dysplastic nevi, and 95 melanoma. They compared the performance of different algorithms on these images using a 31-point feature vector comprised of detailed border, color, and textural features. Their artificial neural network ran with a learning rate of 0.3 and no momentum for 500 iterations. The result was an average accuracy of 87%, average sensitivity of 81%, and average specificity of 86.7%.

Most recently, Pablo G. Cavalcanti and Jacob Scharcanski's "Automated prescreening of pigmented skin lesions using standard cameras" [12] used a neural network trained on detailed feature vectors modeling the ABCD rule. They trained on 152 lesions (45 nevi and 107 melanoma) and yielded an accuracy of 75.66%, a sensitivity of 75.24%, and a specificity of 80%.

Of these professional projects, the average accuracy is 80.59%, average sensitivity 80.96%, and average specificity 80.67%. The fact that the average values of this project (64.48%, 53.22%, and 72.82% respectively) are not as high can be primarily attributed to smaller and/or less representative sample sizes and a less detailed feature vector. The single biggest advantage of these professional projects is that their data sets contained a much lower proportion of dysplastic nevi to total benign lesions. It is worth noting that at best, this project performed with respective values of 77.50%, 85.71%, and 68.42%. Given an ideal data distribution, this project is competitive with similar ones of professional caliber.
6.2 Possible improvements

Although the project was successful, there is always room for improvement. Most simply, the performance of the neural net would be greatly improved by a larger, more varied, and more representative data set. The inclusion of common nevi along with dysplastic would drastically increase specificity, and having a lower ratio of melanoma to benign lesions would render the classification more applicable to the real world, in which melanomas make up a very small percentage of all lesions. Ideally there should exist a standardized and optimized database on which to train and test machine learning techniques, so that the performance of all approaches can be compared and the success of an approach never has to be contingent on the data set used.

Another improvement would be a more detailed feature vector. The studies above used feature vectors of up to 107 points, compared to my 11. The more detailed the input data, the more nuanced a pattern the network is capable of finding.

In two of the studies mentioned above, the performance of different machine learning techniques on the problem of skin lesion classification was compared. Several approaches were instituted with higher levels of success than artificial neural networks. The most promising of these approaches is boosted Haar features.

Figure 12: The Haar wavelet

Among past machine learning approaches to medical image recognition, the technique of boosted Haar features is relatively common. Haar-like features are digital image features used in the field of object recognition, named for their intuitive similarity to the Haar wavelet (see Figure 12). A wavelet is a nonlinear function, collections of which form bases that are used to reconstruct more complex functions via projection. The Haar wavelet is the simplest possible wavelet. A set of Haar wavelets can be transformed into any other wave by scaling each individually and summing the set. The use of Haar-like features categorizes an image into subsections based on the difference between summed pixel intensities in adjacent rectangular regions at a specific location in a detection window. For instance, in the facial recognition problem, Haar wavelets are used to distinguish eyes, which are commonly shadowed, from cheeks, which are commonly the lightest part of the face. The key advantage of detecting a Haar-like feature over other types of features is the calculation speed. If a summed area table (also known as an integral image) is used, a Haar-like feature of any size can be calculated in constant time. A summed area table stores summed values of a rectangular subset of a grid, where the value at any point (x, y) in the summed area table is the sum of the entries from (0, 0) to (x, y) inclusive in the data table, as seen in Figure 13.

Figure 13: Summed area table

Several popular algorithms utilize Haar-like features, such as the Viola-Jones object detection framework.²⁷ Proposed in 2001 as a solution to the facial recognition problem, it was the first framework to provide competitive object detection rates in real time. In the detection phase, a target-sized window is passed over the input image, defining subsections. For each subsection of the image the pixel intensities are summed, thus calculating the Haar-like feature. The difference is then compared against a learned threshold that distinguishes objects from non-objects. Due to its simplicity, a Haar-like feature is only a weak learner or classifier; its detection quality is only slightly better than random guessing. Therefore either a large number of Haar-like features or a boosting algorithm to strengthen the most representative ones is necessary to describe an object with sufficient accuracy.²⁸

²⁷ From Paul Viola and Michael Jones, "Robust Real-time Object Detection" [27]
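The summed area table described above can be built in one pass, after which any rectangle sum (and hence any Haar-like feature) costs four lookups. A minimal illustrative sketch:

```python
def summed_area_table(grid):
    """sat[y][x] = sum of grid[0..y][0..x] inclusive, built in one pass."""
    h, w = len(grid), len(grid[0])
    sat = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += grid[y][x]
            sat[y][x] = row_sum + (sat[y - 1][x] if y > 0 else 0)
    return sat

def rect_sum(sat, x0, y0, x1, y1):
    """Sum over the rectangle (x0,y0)-(x1,y1) inclusive, in O(1)."""
    total = sat[y1][x1]
    if x0 > 0:
        total -= sat[y1][x0 - 1]
    if y0 > 0:
        total -= sat[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += sat[y0 - 1][x0 - 1]
    return total
```

A two-rectangle Haar-like feature is then simply the `rect_sum` of one region minus the `rect_sum` of the adjacent region, regardless of how large the regions are.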
Boosting is a machine learning meta-algorithm implemented alongside supervised learning algorithms to transform a set of weak learners into a single strong learner: a classifier whose performance is strongly correlated with the true classification. Meta-algorithms are iterative optimization algorithms that try to improve a candidate solution (a member of the set of all feasible solutions to the given problem) with regard to a given measure of quality. In Reddy and Bhuyan's paper, Haar-like features are used in conjunction with the AdaBoost algorithm to increase their detection power. The AdaBoost (adaptive boosting) algorithm predisposes classifiers built later in the process in favor of instances misclassified by previous classifiers by use of weights.²⁹ In each run, it generates and calls a new weak classifier and updates the distribution of weights to indicate the importance of specific data for the classification, increasing the weights of incorrectly classified examples and decreasing those of correctly classified ones. This causes the new classifier to focus on the more heavily weighted former examples. In Reddy and Bhuyan's paper, the image category is trained, then the weights and weak classifiers are stored. Next, the AdaBoost algorithm takes several weak classifiers (the Haar-like features) and develops them into stronger models after a number of iterations, at which point the highly selective features that minimize the classification error are extracted. The AdaBoost algorithm was the first boosting algorithm to gain prominence. Since then, LPBoost, TotalBoost, BrownBoost, MadaBoost, LogitBoost, and more have had success as well.
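The reweighting step at the heart of AdaBoost can be sketched in a few lines. This is a schematic illustration of the standard update, not Reddy and Bhuyan's implementation:

```python
import math

def adaboost_round(weights, predictions, labels):
    """One AdaBoost round: compute the weak learner's weighted error and
    voting weight (alpha), then upweight misclassified examples,
    downweight correct ones, and renormalize."""
    err = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    err = min(max(err, 1e-10), 1 - 1e-10)     # guard against 0 or 1 error
    alpha = 0.5 * math.log((1 - err) / err)   # weak learner's vote weight
    new_w = [w * math.exp(-alpha if p == y else alpha)
             for w, p, y in zip(weights, predictions, labels)]
    z = sum(new_w)                            # normalization constant
    return alpha, [w / z for w in new_w]
```

A classic property of this update: after one round, the examples the weak learner misclassified hold exactly half of the total weight, so the next weak learner must do something new to make progress.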
Boosted Haar features could be used in conjunction with an artificial neural net for increased classifying power. One could be constructed for each feature of a feature vector, then the resultant vector could be classified by a neural network in the same manner as this project. This would drastically improve the feature extraction process, which is basic to begin with and additionally is set to average parameter values that performed well with some images and poorly with others. Instead of increasing the quantity of the feature vector as suggested above, this would increase its quality.

²⁸ From Chandan K. Reddy and Fahima A. Bhuyan, "Retrieval and Ranking of Biomedical Images using Boosted Haar Features" [23]
²⁹ From Yoav Freund and Robert E. Schapire, "A Short Introduction to Boosting" [17]

7 Code and sample output

The project is comprised of eight modules, each representing a specific task, designed to be usable both independently and in conjunction with other modules. The flow of dependencies can be seen in Figure 14. Classify.py first calls preprocess.py, which uses color_extract.py, corner_extract.py, and structure_extract.py to compose a feature vector for each image. Corner_extract.py and structure_extract.py both require bilateral filters, which are applied using image_blur.py. Next, classify.py splits the feature vectors into training and testing sets, then passes each to artificial_neural_network.py, where the network is first trained, then tested on the respective sets. Finally, stats.py is called to calculate the statistics accuracy, sensitivity, specificity, and average error.

Figure 14: Flow of code
Listing 1: classify.py

    import artificial_neural_network
    import preprocess
    import stats
    import random
    import pickle
    import pprint

    pp = pprint.PrettyPrinter()

    n_mels = 101
    n_moles = 129
    train_imgs = []
    test_imgs = []

    # TRAIN_IMGS AND TEST_IMGS ARE ARRAYS OF STRINGS REPRESENTING
    # THE FILE PATHS TO THE IMAGES
    for i in range(n_mels):
        img_str = 'lesion_images/mel' + str(i + 1) + '.jpg'
        if random.random() >= 0.2:
            train_imgs.append(img_str)
        else:
            test_imgs.append(img_str)
    for i in range(n_moles):
        img_str = 'lesion_images/mole' + str(i + 1) + '.jpg'
        if random.random() >= 0.2:
            train_imgs.append(img_str)
        else:
            test_imgs.append(img_str)

    # BEGIN
    print '***********************************'
    print
    print 'PREPROCESSING THE TRAINING SET'
    print
    print '***********************************\n'
    prep_train = preprocess.Preprocessor(train_imgs)
    train_lesions = prep_train.process()

    print '\n***********************************'
    print
    print 'TRAINING THE NETWORK'
    print
    print '***********************************\n'
    ann = artificial_neural_network.ANN(11, 8, 1)
    ann.train(train_lesions, 5000)

    print '\n***********************************'
    print
    print 'PREPROCESSING THE TEST SET'
    print
    print '***********************************\n'
    prep_test = preprocess.Preprocessor(test_imgs)
    test_lesions = prep_test.process()

    print '\n***********************************'
    print
    print 'CLASSIFYING THE TEST SET'
    print
    print '***********************************\n'
    results = ann.test(test_lesions)
    pp.pprint(results)

    print '\n***********************************'
    print
    print 'ACCURACY STATISTICS'
    print
    print '***********************************\n'
    [counts, sens, spec, acc, err] = stats.accuracy(results)
    [total, correct, t_mel, c_mel, t_mole, c_mole] = counts
    print 'Size of the test set:', len(results)
    print '\nSensitivity =', sens * 100, '%'
    print '\tTotal melanoma =', t_mel
    print '\tCorrectly identified melanoma =', c_mel
    print '\nSpecificity =', spec * 100, '%'
    print '\tTotal benign melanocytic lesions =', t_mole
    print '\tCorrectly identified benign melanocytic lesions =', c_mole
    print '\nAccuracy =', acc * 100, '%'
    print '\tTotal lesions =', total
    print '\tCorrectly identified lesions =', correct
    print '\nAverage error =', err * 100, '%'

    # SAVE THE TRAINED ANN
    saved_ann = open('trained_ann.obj', 'w')
    pickle.dump(ann, saved_ann)

    # SAVE THE TRAIN DATA
    saved_train_lesions = open('train_lesions.obj', 'w')
    pickle.dump(train_lesions, saved_train_lesions)

    # SAVE THE TEST DATA
    saved_test_lesions = open('test_lesions.obj', 'w')
    pickle.dump(test_lesions, saved_test_lesions)

    # SAVE THE RESULTS
    saved_results = open('results.obj', 'w')
    pickle.dump(results, saved_results)

Listing 2: preprocess.py

    import cv
    import color_extract
    import corner_extract
    import structure_extract

    class Preprocessor:
        def __init__(self, imgpaths):
            self.imgs = []
            for i in imgpaths:
                img = cv.LoadImageM(i, 1)
                if "mel" in i:
                    classification = 1
                elif "mole" in i:
                    classification = 0
                self.imgs.append((img, classification))

        def process(self):
            results = []
            n = 1
            for i in self.imgs:
                print "/////// PROCESSING IMAGE", n, "///////"
                colors = color_extract.getColors(i[0])
                corners = corner_extract.getCorners(i[0])
                structures = structure_extract.getStructures(i[0])
                feature_vector = colors + corners + structures
                results.append([feature_vector, [i[1]]])
                n += 1
            return results

Listing 3: artificial_neural_network.py

    import numpy as np
    import pprint

    pp = pprint.PrettyPrinter()

    class ANN:
        def __init__(self, ni, nh, no):
            # NUMBER OF INPUT, HIDDEN, AND OUTPUT NODES
            self.ni = ni + 1
            self.nh = nh
            self.no = no
            # ACTIVATIONS
            self.ai = [1.0] * self.ni
            self.ah = [1.0] * self.nh
            self.ao = [1.0] * self.no
            # WEIGHT MATRICES
            self.wi = np.random.random((self.ni, self.nh))
            self.wo = np.random.random((self.nh, self.no))
            # LAST CHANGE IN WEIGHTS FOR MOMENTUM
            self.ci = np.zeros(shape=(self.ni, self.nh))
            self.co = np.zeros(shape=(self.nh, self.no))

        def update(self, inputs):
            if len(inputs) != self.ni - 1:
                raise ValueError, 'Error: incorrect number of inputs'
            # INPUT ACTIVATIONS
            for i in range(self.ni - 1):
                self.ai[i] = inputs[i]
            # HIDDEN ACTIVATIONS [sum(wi * xi) + bias]
            for j in range(self.nh):
                sigma = 0.0
                for i in range(self.ni):
                    sigma += self.ai[i] * self.wi[i][j]
                self.ah[j] = 1.0 / (1.0 + np.exp(-sigma))
            # OUTPUT ACTIVATIONS [sum(wi * xi) + bias]
            for j in range(self.no):
                sigma = 0.0
                for i in range(self.nh):
                    sigma += self.ah[i] * self.wo[i][j]
                self.ao[j] = 1.0 / (1.0 + np.exp(-sigma))
            return self.ao

        def backPropagate(self, targets, N, M):
            if len(targets) != self.no:
                raise ValueError, 'Error: incorrect number of target values'
            # ERROR TERMS FOR OUTPUT
            output_deltas = np.zeros(self.no)
            for i in range(self.no):
                ao = self.ao[i]
                output_deltas[i] = ao * (1 - ao) * (targets[i] - ao)
            # ERROR TERMS FOR HIDDEN
            hidden_deltas = np.zeros(self.nh)
            for i in range(self.nh):
                sigma = 0.0
                for j in range(self.no):
                    sigma += output_deltas[j] * self.wo[i][j]
                hidden_deltas[i] = self.ah[i] * (1 - self.ah[i]) * sigma
            # UPDATE OUTPUT WEIGHTS
            for i in range(self.nh):
                for j in range(self.no):
                    change = output_deltas[j] * self.ah[i]
                    self.wo[i][j] = self.wo[i][j] + N * change + M * self.co[i][j]
                    self.co[i][j] = change
            # UPDATE INPUT WEIGHTS
            for i in range(self.ni):
                for j in range(self.nh):
                    change = hidden_deltas[j] * self.ai[i]
                    self.wi[i][j] = self.wi[i][j] + N * change + M * self.ci[i][j]
                    self.ci[i][j] = change
            # TOTAL ERROR
            error = 0.0
            for i in range(self.no):
                delta = targets[i] - self.ao[i]
                error += 0.5 * delta ** 2
            return error

        def train(self, patterns, iterations=5000, N=0.5, M=0.9):
            # N: learning rate
            # M: momentum factor
            for i in xrange(iterations):
                error = 0.0
                for p in patterns:
                    inputs = p[0]
                    targets = [p[1][0]]
                    self.update(inputs)
                    error += self.backPropagate(targets, N, M)
                if i % 100 == 0:
                    print 'error:', error

        def test(self, patterns):
            sus = []
            for p in patterns:
                su = self.update(p[0])
                sus.append(su[0])
            for i in range(len(sus)):
                patterns[i][1].append(sus[i])
            return patterns

Listing 4: stats.py

    def accuracy(results):
        total = len(results)
        correct = 0
        t_mel = 0
        t_mole = 0
        c_mel = 0
        c_mole = 0
        agg_error = 0
        for r in results:
            ex = r[1][0]
            act = r[1][1]
            agg_error += abs(ex - act)
            if act < .5:
                act_t = 0
            else:
                act_t = 1
            if ex:
                # SENSITIVITY (CORRECT POSITIVES)
                t_mel += 1
                if act_t:
                    c_mel += 1
                    correct += 1
            else:
                # SPECIFICITY (CORRECT NEGATIVES)
                t_mole += 1
                if not act_t:
                    c_mole += 1
                    correct += 1
        sensitivity = c_mel / float(t_mel)
        specificity = c_mole / float(t_mole)
        accuracy = correct / float(total)
        avg_error = agg_error / float(total)
        return [[total, correct, t_mel, c_mel, t_mole, c_mole],
                sensitivity, specificity, accuracy, avg_error]

Listing 5: color_extract.py

    import cv
    import numpy as np

    def getColors(img_m):
        print "...extracting colors"
        img = np.asarray(img_m)
        brown_min = [120, 65, 65]
        brown_max = [200, 130, 130]
        white_min = [190, 190, 190]
        white_max = [255, 255, 255]
        black_min = [0, 0, 0]
        black_max = [80, 80, 80]
        blue_min = [50, 50, 125]
        blue_max = [130, 130, 255]
        red_min = [130, 50, 50]
        red_max = [255, 100, 100]
        colors = []
        colors.append(searchColor(img, brown_min, brown_max))
        colors.append(searchColor(img, white_min, white_max))
        colors.append(searchColor(img, black_min, black_max))
        colors.append(searchColor(img, blue_min, blue_max))
        colors.append(searchColor(img, red_min, red_max))
        for i in range(len(colors)):
            if (colors[i]):
                colors[i] = 1
            else:
                colors[i] = 0
        return colors

    def searchColor(img_, cmin, cmax):
        img__ = np.asarray(img_)
        img = np.array(img__)
        for row in range(img.shape[0]):
            for col in range(img.shape[1]):
                x = img[row][col]
                if cmin[2] ...
                # [remainder of this listing omitted]

Listing 6: corner_extract.py

    import cv
    import cv2
    import numpy
    import image_blur

    def getCorners(img):
        print "...extracting corners"
        img_b = image_blur.blurImage(img)
        img_arr = numpy.asarray(img_b)
        img_g_mat = cv2.cvtColor(img_arr, cv2.COLOR_RGB2GRAY)
        img_g = cv.fromarray(img_g_mat)
        cv.Scale(img_g, img_g, 2)
        cv.Canny(img_g, img_g, 200, 220)
        eig_image = cv.CreateMat(img_g.rows, img_g.cols, cv.CV_8UC1)
        temp_image = cv.CreateMat(img_g.rows, img_g.cols, cv.CV_8UC1)
        corners = []
        strength_thresh = 0.8
        for (x, y) in cv.GoodFeaturesToTrack(img_g, eig_image, temp_image,
                                             5, strength_thresh, 10.0,
                                             blockSize=5):
            corners.append(1)
        while (len(corners) < 5):
            corners.append(0)
        return corners

Listing 7: structure_extract.py

    import cv
    import cv2
    import numpy
    import image_blur

    def getStructures(img):
        print "...extracting structures"
        # CONVERT TO BLACK AND WHITE
        img_a = numpy.asarray(img)
        img_c = cv2.cvtColor(img_a, cv.CV_RGB2GRAY)
        img_c = cv.fromarray(img_c)
        # DEFINE REGION OF INTEREST
        img_c = image_blur.blurImage(img_c)
        stor = cv.CreateMemStorage()
        img2 = cv.CloneMat(img_c)
        cv.Zero(img2)
        cv.Threshold(img_c, img2, 100, 255, cv.CV_THRESH_BINARY)
        cont = cv.FindContours(img2, stor, cv.CV_RETR_LIST,
                               cv.CV_CHAIN_APPROX_NONE, (0, 0))
        maxcenterx = 0
        maxcentery = 0
        maxsizex = 0
        maxsizey = 0
        maxangle = 0
        contour = None
        measures2 = []
        for c in contour_iterator(cont):
            if len(c) >= 6:
                PointArray2D32f = cv.CreateMat(1, len(c), cv.CV_32FC2)
                for (i, (x, y)) in enumerate(c):
                    PointArray2D32f[0, i] = (x, y)
                # FIT AN ELLIPSE AROUND THE POINTS
                (center, size, angle) = cv.FitEllipse2(PointArray2D32f)
                center = (cv.Round(center[0]), cv.Round(center[1]))
                size = (cv.Round(size[0] * 0.5), cv.Round(size[1] * 0.5))
                # STORE THE NEW MAX ELLIPSES
                if size[0] > maxsizex or size[1] > maxsizey:
                    maxcenterx = center[0]
                    maxcentery = center[1]
                    maxsizex = size[0]
                    maxsizey = size[1]
                    maxangle = angle
                    contour = c
                    r = cv.BoundingRect(PointArray2D32f)
                    measures2.append(r)
        # RETURN THE LARGEST ELLIPSE OR RECTANGLE (THAT ISN'T THE ENTIRE
        # BORDER OF THE IMAGE)
        if (len(measures2) < 2):
            return [0]
        r = measures2[-2]
        cv.Scale(img_c, img_c, 2)
        cv.Canny(img_c, img_c, 200, 220)
        img_c = numpy.asarray(img_c)
        count = 0
        total = 0
        for x in range(r[2] - 1):
            for y in range(r[3] - 1):
                i = img_c[r[1] + y][r[0] + x]
                if (i.any()):
                    count += 1
                total += 1
        # DELETE A COUNT DENOTING THE CIRCUMFERENCE OF THE ELLIPSE THAT
        # FITS INSIDE THE REGION, IN AN ATTEMPT TO NORMALIZE -- USE
        # CIRCUMFERENCE OF CIRCLE WITH RADIUS AVERAGE OF THE TWO AXES
        circum = ((r[2] + r[3]) / 4) * 3.14
        ratio = (count - circum) / float(total) * 100
        # RETURN TRUE IF RATIO IS GREATER THAN 5% OF THE REGION OF
        # INTEREST, AND IF THE REGION IS SUFFICIENTLY LARGE
        if (ratio > 5 and r[2] * r[3] > 20):
            structures = [1]
        else:
            structures = [0]
        return structures

    def contour_iterator(contour):
        while contour:
            yield contour
            contour = contour.h_next()

Listing 8: image_blur.py

    import cv
    import cv2
    import numpy

    def blurImage(img):
        img_arr = numpy.asarray(img)
        sigma_color = 60
        sigma_space = 20
        bi_filt_mat = cv2.bilateralFilter(img_arr, -1, sigma_color,
                                          sigma_space)
        bi_filt = cv.fromarray(bi_filt_mat)
        return bi_filt

Listing 9: Sample output

    ***********************************

    IMPORTING THE DATA

    ***********************************

    ***********************************

    TRAINING THE NETWORK

    ***********************************
    error: 5.09075367194
    error: 17.1295229478
    error: 18.3122883278
    error: 18.0245064757
    error: 18.2418582286
    error: 18.1364958022
    error: 17.9341228693
    error: 17.7953834987
    error: 17.3095563311
    error: 17.0625905663
    error: 16.7112176434
    error: 16.6339603499
    error: 16.9568052182
    error: 16.8266045433
    error: 16.3518436498
    error: 16.3268902898
    error: 16.2248581036
    error: 16.2395903161
    error: 16.2494250968
    error: 16.2526632909
    error: 16.2271156423
    error: 16.2233162416
    error: 16.2216423189
    error: 16.2213608217
    error: 16.221024769
    error: 16.2202050491
    error: 16.2186539405
    error: 16.2160403486
    error: 16.2120485318
    error: 16.2063557914
    error: 16.1987450176
    error: 16.1896409496
    error: 16.1801999145
    error: 16.1695773391
    error: 16.1570886199
    error: 16.1479393628
    error: 16.1410826955
    error: 16.1352364442
    error: 16.1299563312
    error: 16.1250819598
    error: 16.1205362738
    error: 16.1162716589
    error: 16.1122542263
    error: 16.1084579056
    error: 16.1048616164
    error: 16.1014477166
    error: 16.0982010843
    error: 16.0951085404
    error: 16.0921584619
    error: 16.089340507

    ***********************************

    CLASSIFYING THE TEST SET

    ***********************************

    [[[1,0,1,0,1,1,1,1,1,1,0], [1, 0.50000000255680987]],
     [[1,1,1,0,0,1,1,1,0,0,0], [1, 0.50011964901768124]],
     [[1,1,1,0,0,1,1,1,1,1,1], [1, 0.99637516479451549]],
     [[1,1,1,0,1,1,1,1,1,1,0], [1, 0.50000739804787309]],
     [[1,1,1,0,0,1,1,1,1,1,1], [1, 0.99637516479451549]],
     [[1,0,0,0,1,1,1,1,1,0,1], [1, 0.30372364761351178]],
     [[0,1,1,0,0,1,1,1,1,1,1], [1, 0.98529907712560372]],
     [[1,0,1,0,0,1,1,1,1,1,0], [1, 0.50001511939740761]],
     [[1,0,1,0,1,1,1,1,1,1,0], [1, 0.50000000255680987]],
     [[1,0,0,0,0,1,1,1,1,1,0], [1, 0.023723619286717772]],
     [[1,0,1,0,0,1,1,1,1,1,1], [1, 0.50093168418064304]],
     [[1,1,1,0,0,1,1,1,1,1,0], [1, 0.99637511307992255]],
     [[1,1,0,0,0,1,1,1,1,1,0], [1, 0.7723983215865774]],
     [[1,0,1,0,0,1,1,1,1,1,1], [1, 0.50093168418064304]],
     [[1,1,1,0,0,1,1,1,1,1,1], [1, 0.99637516479451549]],
     [[0,1,1,0,0,1,1,1,1,1,1], [1, 0.98529907712560372]],
     [[0,1,0,0,1,1,1,1,1,1,0], [1, 0.30442403432660664]],
     [[1,0,1,0,0,1,1,1,1,1,1], [1, 0.50093168418064304]],
     [[1,0,1,0,0,1,1,1,1,1,0], [1, 0.50001511939740761]],
     [[1,0,1,0,1,1,1,1,1,1,0], [1, 0.50000000255680987]],
     [[0,1,1,0,0,1,1,1,1,1,0], [1, 0.86837800350830896]],
     [[0,0,1,0,0,1,1,0,0,0,1], [0, 0.016370723537866984]],
     [[1,0,0,0,0,1,1,0,0,0,1], [0, 0.016379133648241291]],
     [[1,0,1,0,0,1,1,0,0,0,1], [0, 0.42669862657791791]],
     [[0,0,1,0,1,1,1,0,0,0,0], [0, 0.27219357184337278]],
     [[1,0,1,0,1,1,1,1,1,1,0], [0, 0.50000000255680987]],
     [[1,1,0,0,1,1,1,0,0,0,0], [0, 0.68997819573215236]],
     [[0,1,1,0,0,1,0,0,0,0,1], [0, 0.019340830455362307]],
     [[0,1,0,0,0,1,0,0,0,0,0], [0, 0.017019156010367409]],
     [[1,0,1,0,1,1,1,1,0,0,1], [0, 0.5000621934971573]],
     [[1,0,1,0,0,1,1,1,1,1,1], [0, 0.50093168418064304]],
     [[1,0,0,0,1,1,1,1,1,1,0], [0, 0.49364998052874182]],
     [[0,1,0,0,0,1,1,0,0,0,1], [0, 0.61642573521371946]],
     [[1,0,1,0,0,1,1,0,0,0,1], [0, 0.42669862657791791]],
     [[0,0,0,0,0,1,1,1,1,1,1], [0, 0.016370840596957523]],
     [[1,0,0,0,1,1,1,0,0,0,0], [0, 0.041809036044299468]],
     [[1,0,0,0,1,1,1,1,1,1,0], [0, 0.49364998052874182]],
     [[1,0,0,0,0,1,1,1,1,0,1], [0, 0.016370495836683031]],
     [[1,0,1,0,0,1,1,1,1,1,1], [0, 0.50093168418064304]],
     [[0,0,0,0,0,1,1,0,0,0,1], [0, 0.016371999410031139]]]

    ***********************************

    ACCURACY STATISTICS

    ***********************************

    Size of the test set: 40

    Sensitivity = 85.7142857143 %
        Total melanoma = 21
        Correctly identified melanoma = 18

    Specificity = 68.4210526316 %
        Total benign melanocytic lesions = 19
        Correctly identified benign melanocytic lesions = 13

    Accuracy = 77.5 %
        Total lesions = 40
        Correctly identified lesions = 31

    Average error = 33.3738844071 %

References

[1] Ai-junkie.com.
[2] dermquest.org.
[3] Mayo Clinic, August 2011.
[4] Deeplearning.net, 2012.
[5] Dermoscopy.org, 2012.
[6] National Cancer Institutes, 2012.
[7] World Health Organization, 2012.
[8] James A. Anderson. An Introduction to Neural Networks. Massachusetts Institute of Technology, third edition, 1997.
[9] Ross Berteig. Basic concepts for neural networks.
[10] Christopher Bishop. Pattern Recognition and Machine Learning. Springer Science+Business Media, LLC, 2006.
[11] Gary Bradski and Adrian Kaehler. Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media, Inc., first edition, September 2008.
[12] Pablo G. Cavalcanti and Jacob Scharcanski. Automated prescreening of pigmented skin lesions using standard cameras. Computerized Medical Imaging and Graphics, pages 481-491, 2011.
[13] Fikret Ercal et al. Neural network diagnosis of malignant melanoma from color images. Transactions on Biomedical Engineering, 41(9), September 1994.
[14] G. Argenziano et al. Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions. Comparison of the ABCD rule of dermatoscopy and a new 7-point checklist based on pattern analysis. Arch Dermatol, December 1998.
[15] Stephan Dreiseitl et al. Comparison of machine learning methods for the diagnosis of pigmented skin lesions. Journal of Biomedical Informatics, pages 28-36, January 2001.
[16] Yi Ding et al. A computer assisted diagnosis system for malignant melanoma using 3D skin surface texture features and artificial neural network. Int. J. Modelling, Identification and Control, 2009.
[17] Yoav Freund and Robert E. Schapire. A short introduction to boosting. Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September 1999.
[18] Donald Hebb. The Organization of Behavior. Wiley and Sons, 1949.
[19] Geoffrey E. Hinton. Deep belief networks. Scholarpedia, 4(5):5947, 2009.
[20] Ilias Maglogiannis and Charlampos N. Doukas. Overview of advanced computer vision systems for skin lesions characterization. IEEE Transactions on Information Technology in Biomedicine, 13(5):721-733, September 2009.
[21] Franz Nachbar and Wilhelm Stolz. The ABCD rule of dermatoscopy: High prospective value in the diagnosis of doubtful melanocytic skin lesions. Journal of the American Academy of Dermatology, 30(4):551-559, April 1994.
[22] Olly Oechsle. Distinguishing moles from melanoma using Jasmine.
[23] C. K. Reddy and F. A. Bhuyan. Retrieval and ranking of biomedical images using boosted Haar features. In BioInformatics and BioEngineering, 2008 (BIBE 2008), 8th IEEE International Conference on, pages 1-6, October 2008.
[24] Neil Schemenauer. http://arctrix.com/nas/.
[25] Steven Skiena. Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. University of Cambridge, August 2001.
[26] Sherea M. Stricklin. Melanoma in situ in a private practice setting 2005 through 2009: Location, lesion size, lack of concern. J Am Acad Dermatol, January 2012.
[27] Paul Viola and Michael Jones. Robust real-time object detection. International Journal of Computer Vision, 2001.
[28] D. H. Wolpert and W. G. Macready. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1):67-82, April 1997.