
Identifying Melanoma Using Computer Vision and an Artificial Neural Network

Permanent Link: http://ncf.sobek.ufl.edu/NCFE004659/00001

Material Information

Title: Identifying Melanoma Using Computer Vision and an Artificial Neural Network
Physical Description: Book
Language: English
Creator: Rivers, Hannah
Publisher: New College of Florida
Place of Publication: Sarasota, Fla.
Creation Date: 2012
Publication Date: 2012

Subjects

Subjects / Keywords: Melanoma
Artificial Neural Network
Computer vision
Genre: bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Melanoma is an increasingly common and very serious form of cancer, claiming almost fifty thousand lives annually. The single most important factor in successful treatment of melanoma is early detection. In this thesis, an artificial neural network designed to distinguish melanoma from dysplastic nevi was created and trained. Heuristics employed by dermatologists to diagnose melanoma were synthesized into a preprocessing method that extracts relevant features, transforming an image into a feature vector. These vectors were then passed to the artificial neural network so it could learn patterns in the data. The artificial neural network was trained on an amalgamation of images from two publicly available data sets consisting of images of 101 melanoma and 129 benign melanocytic lesions. Because the performance of artificial neural networks is stochastic, results were averaged over a variety of parameter and starting values. On average, the network is capable of distinguishing between the lesions with 64% accuracy, with 53% sensitivity and 73% specificity. At best, its accuracy is 78%, with 86% sensitivity and 68% specificity.
Statement of Responsibility: by Hannah Rivers
Thesis: Thesis (B.A.) -- New College of Florida, 2012
Supplements: Accompanying materials: Code and Data Set
Electronic Access: RESTRICTED TO NCF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE
Bibliography: Includes bibliographical references.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The New College of Florida, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Local: Faculty Sponsor: McDonald, Patrick

Record Information

Source Institution: New College of Florida
Holding Location: New College of Florida
Rights Management: Applicable rights reserved.
Classification: local - S.T. 2012 R62
System ID: NCFE004659:00001




IDENTIFYING MELANOMA USING COMPUTER VISION AND AN ARTIFICIAL NEURAL NETWORK

BY

HANNAH RIVERS

A Thesis

Submitted to the Division of Natural Sciences
New College of Florida
in partial fulfillment of the requirements for the degree Bachelor of Arts

Under the sponsorship of Pat McDonald

Sarasota, Florida
April, 2012

Contents

1 Introduction to the problem
  1.1 Facts about melanoma
  1.2 Background information on human classification of melanoma
  1.3 Background information on machine learning approaches

2 Computer vision
  2.1 An introduction to computer vision

3 Theory of neural computation
  3.1 An introduction to neurocomputing
  3.2 Biological neural networks
  3.3 Artificial neural networks
  3.4 Types of problems for which artificial neural nets are best suited
  3.5 Why an artificial neural network is a good fit for this problem

4 Artificial neural networks
  4.1 Necessary concepts
  4.2 The perceptron
  4.3 The multi-layer perceptron

5 Considered and selected approaches
  5.1 Questions of intent
  5.2 Preprocessing techniques
  5.3 Selection of features
    5.3.1 ABCD rule (Stolz method)
    5.3.2 Menzies' method
    5.3.3 Seven-point checklist
    5.3.4 Formulated method used in this thesis
  5.4 Practical limitations
  5.5 Software
    5.5.1 Computer vision
    5.5.2 Artificial neural network

6 Final product
  6.1 Results
    6.1.1 Project results
    6.1.2 Similar project results
  6.2 Possible improvements

7 Code and sample output

References

List of Figures

1 A neuron
2 Artificial neuron
3 Gradient descent
4 The perceptron function
5 Multi-layer perceptron
6 Stolz's ABCD rule for identifying melanoma
7 Color detection
8 Corner detection
9 Structure detection
10 First step of XOR function network training for first input
11 Results
12 The Haar wavelet
13 Summed area table
14 Flow of code

IDENTIFYING MELANOMA USING COMPUTER VISION AND AN ARTIFICIAL NEURAL NETWORK

Hannah Rivers

New College of Florida, 2012

ABSTRACT

Melanoma is an increasingly common and very serious form of cancer, claiming almost fifty thousand lives annually. The single most important factor in successful treatment of melanoma is early detection.

In this thesis, an artificial neural network designed to distinguish melanoma from dysplastic nevi was created and trained. Heuristics employed by dermatologists to diagnose melanoma were synthesized into a preprocessing method that extracts relevant features, transforming an image into a feature vector. These vectors were then passed to the artificial neural network so it could learn patterns in the data.

The artificial neural network was trained on an amalgamation of images from two publicly available data sets consisting of images of 101 melanoma and 129 benign melanocytic lesions. Because the performance of artificial neural networks is stochastic, results were averaged over a variety of parameter and starting values. On average, the network is capable of distinguishing between the lesions with 64% accuracy, with 53% sensitivity and 73% specificity. At best, its accuracy is 78%, with 86% sensitivity and 68% specificity.

Pat McDonald
Division of Natural Sciences

1 Introduction to the problem

1.1 Facts about melanoma¹

The most common form of cancer in the United States is skin cancer, affecting between two and three million people globally each year and comprising a third of all diagnosed cases of cancer. According to the World Health Organization, one in every five Americans will develop skin cancer in their lifetime. Melanoma is a type of skin cancer; specifically, it is a cancer of melanocytes, the cells responsible for the pigmentation of skin. While it is among the least common types, it is the most serious; according to the National Institutes of Health, roughly 160,000 cases are diagnosed globally each year, leading to 48,000 melanoma-related deaths. The incidence of both non-melanoma and melanoma skin cancers has been increasing over the past decade.

Although serious, melanoma can be cured if it is diagnosed and treated early. If, however, it is not removed in its initial stages, the cancer cells can expand downward from the skin surface, invading healthy surrounding tissue and organs. Once this has occurred, it is very difficult to control. Consequently, the most important aspect of melanoma treatment is early recognition.

Obviously, it is prudent to be cautious when it comes to abnormal melanocytic lesions. Unfortunately, this caution comes at a price; health care is expensive. Many people cannot afford to consult a doctor or a dermatologist at every suspicion. Mitigating this concern is the hopeful eventuality of the application of this thesis and other works like it. In general, the trend of modern medicine is to create machines that perform tasks previously performed by humans, ideally with greater accuracy in less time using fewer resources. The type of work in this thesis is the first step down that road.

¹ Facts from the websites of the National Cancer Institute, the World Health Organization, and the Mayo Clinic. [6][7][3]

On that note, it is critically important to issue the following disclaimer:

The intent of any application directly based on this thesis is to serve as a guide to indicate the likelihood of melanoma. It is exclusively a primary recognition tool, and is in no way meant to be used as a diagnostic tool or as a replacement for a doctor. If you have serious concerns that you may have melanoma, it is imperative that you seek qualified medical advice.

1.2 Background information on human classification of melanoma²

To both aid in the diagnosis of and assist the public in recognizing melanoma, doctors developed the heuristic acronym "ABCDE." Each letter represents a characteristic indicative of a melanoma, and meeting one or more of these criteria may indicate its existence. The first characteristic is asymmetry. Common nevi (benign, common lesions, otherwise known as moles) are generally symmetrical circles, while the shape of a melanoma is often irregular. The second characteristic regards the border of the lesion. The borders of common nevi are usually very well defined and obvious, fading out gradually, while a melanoma may have a ragged, blurred, sharp, or otherwise irregular border. The third characteristic is color. Typically, common nevi are a single, uniform shade of medium to dark brown. Melanomas may be a variety of colors including black, red, white, or even slightly blue. Additionally, their color may not be consistent, but rather form "patches." Depending on your source, the fourth distinguishing characteristic is either diameter or differential structures. While common nevi are almost invariably one to three millimeters across, a melanoma is commonly larger than six. Additionally, melanomas may contain structures unseen in common nevi, such as globules, streaks, and patterns (see Section 5.3).

² Facts from the website of the National Cancer Institute. [6]

The last and most indicative characteristic is evolution of the lesion. Any changes in shape, border, color, size, or structure are highly indicative of a melanoma, as is the presence of pain, itchiness, bleeding, oozing, or general irritation.

Due to the time-sensitive nature of melanoma treatment, doctors recommend monthly self-examinations to inspect for skin lesions fitting the ABCDE rule. It is important to familiarize yourself with your lesions so as to recognize changes. The use of a hand-held mirror is recommended to inspect hard-to-see regions. Melanoma can occur on the scalp, groin, fingernails, the soles of feet, and in the area between toes, so it is imperative to be thorough.

If a suspicious lesion is found, the next step is to contact a dermatologist or physician. If they suspect that it is possibly a melanoma, they will biopsy the lesion and send the sample to a pathologist for classification. There are four types of biopsies: shave, punch, incisional, and excisional. The first involves shaving the lesion off with a thin, sharp blade. The second involves a tool with a circular blade, used to remove a round section of skin containing the lesion in question. In the third, only the most irregular portion of the growth is removed for analysis. In the fourth, the entire lesion is surgically removed. These are often performed when the likelihood of melanoma is high and time is of the essence. Within a few days, the pathologist will inspect the sample and empirically determine whether or not it is cancerous.

In the case of a positive diagnosis, the ensuing action is to quantify the severity of the condition. Melanoma is staged between I and IV, where a stage I melanoma is small with a predominantly successful treatment rate and a stage IV melanoma has spread to other organs and subsequently has a very low likelihood of successful treatment. The pathologist assigns the stage through two main criteria: thickness and the results of a sentinel node biopsy. The thickness of a lesion can be measured under a microscope. Because melanomas expand first horizontally then downwards into the skin, the higher the measure the more serious the disease. A sentinel node biopsy is a test conducted by the doctor which determines whether cancer cells have spread to nearby lymph nodes by injecting dye into the area at which the melanoma was removed, then charting its course through the lymphatic system. The first nodes to collect dye are removed and tested for cancer cells. If they are not present in the nearest lymph nodes, it is unlikely that the melanoma has spread.

The differentiation of early melanoma from dysplastic nevi (uncommon benign lesions that resemble melanoma) is difficult even for experienced dermatologists. In fact, underestimation of the severity of melanoma in its early stages is disconcertingly not uncommon. However, statistically the much larger issue is the patient failing to notice or to consult a doctor in time. In the majority of cases, if the patient brings the lesion to the doctor's attention early, melanoma can be properly identified and treated.

1.3 Background information on machine learning approaches

Medical image recognition is not a new application for the field of machine learning. There have been numerous previous endeavors in skin lesion classification (some of which will be discussed in Section 6.2) and similar problems, with varying methods and varying levels of success, usually directly correlated with accessibility of financial resources.

Despite their differences in methods, all machine learning approaches to medical image classification necessarily share a set of fundamental characteristics, consisting of image acquisition, feature selection and extraction, image processing and analysis, and classification. The first step in any approach is the acquisition of a set of images on which to train the model. The next is the extraction of pertinent data from these images. Finally, said data must be processed in such a manner that analysis of the results yields a classification. In machine learning, the processing is done through a series of adaptive algorithms that allow the model to evolve its behaviors based on relationships and patterns present in the data. This conduces to more intelligent decision making than can be done with a static algorithm.

In each of the aforementioned areas, the details of operation vary. For instance, the primary question of image acquisition is what type of images the program is ultimately going to be used to classify. These are the images on which it is necessary to train the network. The next step is to determine where a set of these images can be attained, keeping in mind that supervised learning requires a data set with known outcomes. Next, decisions must be made regarding feature selection and extraction. The success of an image recognition program depends on the correct selection of distinguishing features for classification. The features must be quantifiably measurable and have both high sensitivity (the proportion of correctly identified positive responses) and high specificity (the proportion of correctly identified negative responses). For the problem of melanoma identification, sensitivity is by far the more important value. It is far preferable to mistakenly classify a negative value as positive than vice versa. In the specific field of skin lesion classification, diagnostic methods such as the ABCD rule (also known as Stolz's method), Menzies' method, and the seven-point checklist have been developed with effective feature selection techniques. These methods are further explained in Section 5.3. Once the features have been selected, the next step is to determine the method through which they will be extracted. Many freeware and open source computer vision libraries and software suites for this purpose already exist, of varying levels of establishment. Pending feature extraction, the next step is to process the extracted data. Each of the aforementioned diagnostic methods has a corresponding set of rules through which to process feature data into information. Typically, these fall under the category of standard optimization problems that can be resolved with heuristic strategies, greedy or genetic algorithms, other computational intelligence methods such as strategies derived from statistical pattern recognition, or in this case, artificial neural nets. The final step of the process is to take the information generated in the previous step and decide for which thresholds it should be classified into different categories, as well as which categories to use. For certain algorithms, this is automatically performed in the preprocessing step. Unsupervised algorithms such as clustering and blind signal separation determine on their own which categories properly divide the data set.

For the problem of skin lesion classification, the most substantial issue is that of training image acquisition. This is the area in which budget has the most limiting effect. Ideally, epiluminescence microscopy (ELM) can be used. Otherwise known as dermoscopy, ELM is the examination of skin lesions with a dermatoscope. This process consists of a magnifier, typically ×10, a non-polarized light source, a transparent plate, and a liquid medium between the instrument and the skin, which allows detailed inspection of skin lesions unobstructed by reflections on the skin's surface. It is used to render the epidermis translucent, making the dermal features clearly visible. Dermoscopy images allow detection of early melanomas as small as 1 mm³, a feat impossible with standard clinical images. Another option is transmission electron microscopy (TEM), a second microscopy technique which reveals the structure of elastic networks in the dermis. A beam of electrons is transmitted through an ultra-thin specimen, which the electrons interact with as they pass through. From this interaction an image is formed; that image is magnified and focused onto a fluorescent screen on a layer of photographic film. Another option, computed tomography (CT), uses digital geometry processing to generate a three-dimensional image of the inside of an object from a large series of two-dimensional X-ray "slices" taken around a single axis of rotation. Multi-frequency electrical impedance tomography can create an image of a lesion by recording and analyzing the resistance with which electric waves of different frequencies move through it. However, none of these methods are plausible without a professional budget. Instead, clinical images are used. A common economical technique is to use video cameras that can be controlled and parameterized in real time for a more detailed, three-dimensional image with relatively high color constancy. Although this is a relatively inexpensive method, neural networks are only useful on the type of images on which they were trained, so for an application intended to be used by the general public the most readily available imaging devices are high-resolution, low-distortion cameras. Unfortunately, they don't account for color constancy, and it's difficult to get a large, centered, focused, and detailed image of the lesion. Regardless, it is the most plausible option, and subsequently the one pursued in this thesis.⁴

³ From Sherea M. Stricklin et al., "Melanoma in situ in a private practice setting 2005 through 2009: Location, lesion size, lack of concern" [26]
⁴ From Ilias Maglogiannis and Charalampos N. Doukas, "Overview of Advanced Computer Vision Systems for Skin Lesions Characterization" [20]
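The sensitivity and specificity definitions given above can be made concrete with a short sketch. This is an illustration only, not code from the thesis; the label encoding (1 = melanoma, 0 = benign) is an assumption for the example:

```python
def sensitivity_specificity(actual, predicted):
    """Compute sensitivity (proportion of correctly identified positives)
    and specificity (proportion of correctly identified negatives) from
    parallel lists of binary labels, where 1 = melanoma, 0 = benign."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Example: four lesions, one missed melanoma and one false alarm.
sens, spec = sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 1])
```

For melanoma screening, the missed melanoma (which lowers `sens`) is the costlier error, which is why the text weights sensitivity over specificity.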

2 Computer vision⁵

2.1 An introduction to computer vision

Computer vision is a field of computer science that tries to program computers to "see," or comprehend visual data. Gary Bradski and Adrian Kaehler define it as "the transformation of data from a still or video camera into either a decision or a new representation, performed in the context of a specific purpose or task." They define a decision as output of the type "an object exists in this image" and a representation as output such as removing camera motion from an image sequence. The input data may include some contextual information known as metadata that can aid the computer in its task by providing preliminary discriminators.

Although it's an incredibly complex process, processing visual information is inherently natural to us. The human brain divides a visual signal into many channels that stream different kinds of information into different areas of the brain, such as an attention system that identifies important sections of an image while suppressing examination of other areas. Widespread associative inputs from muscle control sensors and all the other senses allow the brain to draw on cross-associations gained from experience. Feedback loops in the brain go back to all stages of processing, including the hardware sensors themselves, which mechanically control lighting via pupil dilation and tune the reception on the surface of the retina.

For computers, this is not a natural or inherent process. The computer receives a grid of numbers, yet has none of the inherent pattern recognition skills or cross-associations that allow and refine human visual perception. Additionally, any given number in the grid has a large noise component. Thus, our task becomes to transform the noisy grid of numbers into the perception of an object of a specific type.

Formally, this is impossible. Neither a unique nor a definitive solution exists to the problem of reconstructing a two-dimensional view of a three-dimensional world into its third dimension. For instance, when we see something close and something far, the closer object is relatively larger. However, without background knowledge, it is just as reasonable to assume that the closer object is actually larger. Angles pose the same problem. Our ability to reconstruct the third dimension is informed by years of practice and collecting a base of knowledge about the nuances of visual perception.

Aside from the dimensionality problem, distortion is another problem a computer can't account for by itself. Distortion comes from a variety of sources including environmental variance (weather, lighting, reflections, movements), mechanical imperfections, finite integration time on the sensor (motion blur), electrical noise, and compression artifacts following image capture. Over the years, compensation for these issues has become second nature to us and we have learned to tune them out. For a machine, the best ways to compensate are through machine learning techniques that correlate hidden variables with their values in a labeled training set, or statistical methods that serve as discriminators. Ideally, a perfectly constrained context would fundamentally simplify the problem; however, that model would have little to no use in the real world. This is where preprocessing becomes very important (see Section 5.2). Normalization is vital to computer vision.

⁵ From Gary Bradski and Adrian Kaehler, "Learning OpenCV: Computer Vision with the OpenCV Library" [11]
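Since normalization is singled out as vital, a minimal sketch of one common form may help: min-max normalization of the "grid of numbers" a computer receives. The pure-Python representation and the 8-bit intensity values are illustrative assumptions, not the preprocessing code used in this thesis:

```python
def normalize(grid):
    """Min-max normalize a 2-D grid of 8-bit intensity values to [0, 1],
    reducing sensitivity to overall lighting differences between images."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a uniform image
    return [[(v - lo) / span for v in row] for row in grid]

# A tiny 2x3 "image": dark on the left, bright on the right.
img = [[10, 130, 250],
       [10, 130, 250]]
norm = normalize(img)  # values now span 0.0 .. 1.0
```

Two photographs of the same lesion taken under different lighting produce different raw grids, but after a step like this their values become directly comparable.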

3 Theory of neural computation⁶

3.1 An introduction to neurocomputing

Before discussing artificial neural networks, there are several prerequisite concepts one must be familiar with in order to understand the field of neurocomputing. First, it is necessary to understand enough neuroscience to comprehend how certain neuroscientific models function to generate certain types of approximations, and in which fields these approximations are more and less accurate. Second, it is necessary to be acquainted with enough fundamental math and coding to understand the implementation of these models. Next, it is necessary to understand enough cognitive science to have some idea about what the brain is supposed to do, and its mechanisms of functionality. Finally, with this knowledge, it is necessary to assume that it's possible to make meaningful simplifications of some aspects of the human nervous system.⁷

Under this inherent assumption, neurocomputing is most simplistically defined as brain-like computation. There are many paradigms of computing system organization; neurocomputing is an attempt to build computers that are designed like the brain, in hopes of emulating it. Brains have both strengths and weaknesses, like any other computational mechanism. Human brains are not all-purpose computing devices; they are powerful at some tasks and weak at others. Their power resides in effective biological preprocessing, the use of memory in place of computing power, and high efficiency with a small number of operations. They're adept at pattern recognition, motor control, perception, flexible inference, intuition, and educated guessing; however, they are slow, imprecise, prejudiced, make erroneous generalizations, and are usually incapable of explaining their actions. All of these properties, desirable or not, may also be typical of neurocomputing. Because biological neural nets are bounded by biological constraints, they provide a wealth of information in the practical engineering and economic sense as well as the computational, embodying a naturally selected example of optimization over available resources.

3.2 Biological neural networks

Biological neural networks are populations of interconnected neurons whose inputs and signaling targets define a recognizable circuit. They are the structure through which our brains process information. Every intentional and instinctual thought and motion is a result of interactions between a network of neurons.

Our brains contain roughly 100 billion neurons. Their structure can be seen in Figure 1. The input end of neurons is made up of dendrites, which branch out in a tree-like manner from the cell body, also known as the soma. Extending from the soma is a long, thin projection known as the axon, the transmission line of the neuron. The axon can give rise to collateral branches, forming a vast, interconnected network. When axons reach their end they branch again into a structure known as the terminal arborization. At the ends of the terminal arborization are synapses, comprising the output end of the neuron. Dendrites receive inputs from other cells; the soma processes the inputs then transmits information along the axon to the synapses, which emit output to be received by other neurons via neurotransmitters diffused across the synaptic cleft, the empty region between the synapses of one neuron and the dendrites of another.

Figure 1: A neuron

Each neuron is connected to thousands of other neurons, all constantly communicating via electrochemical signals. It continuously receives signals from these other neurons, then sums up the inputs according to some internal process. If the result is greater than some threshold value, the neuron fires by generating a voltage, known as an action potential, that transmits down the axon to the terminal arborization. This response is an all-or-nothing binary; either there is an action potential or there is not. Following firing, a neuron has both an absolute and a relative refractory period during which it cannot fire again. For the former, there exists no action threshold nor subsequent possibility of firing, but for the latter, the threshold is merely elevated. This relative refractory period, also known as synaptic resistance, is adaptable, which can cause modified or "learned" behavior of the neuron in which firing frequency will rise if a stimulus is maintained. This phenomenon, first described by Donald Hebb in 1949, is known as Hebb's rule: if the input of a neuron is repeatedly and persistently causing the neuron to fire, a metabolic change occurs in the synapse of that particular input to reduce its resistance.⁸ This increases the efficiency of a common action potential flow through a neural network. These varying, modifiable resistances give rise to detailed interactions among the web of neurons that are paramount to the nature of computation that neural networks perform.

⁶ From James A. Anderson, "An Introduction to Neural Networks" [8]
⁷ Some would argue that neurobiology is intrinsically too nuanced to simplify meaningfully, or that we don't know enough about its mechanism to make correct generalizations. While this may actually be true, without the assumption neuroscientific models are not substantiable.
⁸ From Donald Hebb, "The Organization of Behavior" [18]
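Hebb's rule as just described can be sketched as a simple weight update. The learning rate and the 0/1 activity encoding here are illustrative assumptions, not a model of actual synaptic chemistry:

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Hebb's rule: when presynaptic activity (pre) coincides with the
    neuron firing (post), strengthen the connection -- the artificial
    analogue of reduced synaptic resistance. Activities are 0 or 1."""
    return weight + rate * pre * post

w = 0.5
for _ in range(3):  # stimulus maintained: input and firing coincide 3 times
    w = hebbian_update(w, pre=1, post=1)
# w has grown from 0.5 to about 0.8; an inactive input leaves it unchanged
```

The repeated co-activation strengthens exactly the connections that contributed to firing, which is the mechanism the text credits for efficient action-potential flow.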

PAGE 18

Biologicalpressuresonneuralnetworksenforcetwobasicgroundrules:eachhumanbeginswithroughly100billionneuronsandneuronsmustearntheirexistence ordie.Theyaremetabolicallyexpensive,andthusbiologicalpressuredictatesthat thefewestpossiblemustbeused. Therearethreedi! erenttypesofneurons:sensory,motor,andcentral.Sensory neuronsareactivatedbyphysicalstimuliandsendsignalsthatinformthecentral nervoussystemofthestateofthebodyandtheexternalenvironment.Motorneuronsconnectmusclesande! ectororganstothenervoussystemsothattheycanbe controlled.Centralneuronsformalloftheirinputandoutputconnectionswithother neurons. Interactionsbetweenthesetypesofneuronsarewhatcauseourbrainsandbodiestofunction,distinguishinganimalsfromplants.Theyconstantlyandtirelessly performnotonlyourcognitivefunctionsbutalsoautomaticandunconsciousones, suchasresponsetopainstimulusandmaintainingunconsciousbreathing.Everyone ofouractionsandthoughtsisaresultoflearnedinteractionsbetweenneuronsinan incessantlyowingnetworkspanningourentirebody,generatingourperceptionof theworldanddeterminingourbehavior. 3.3Articialneuralnetworks 9 Anarticialneuralnetwork(ANN)isacomputationalmodelthateitherabstractsorisinspiredbythestructuraland/orfunctionalaspectsofbiologicalneural networks.Itisanetworkofsimpleprocessingelementsthatexhibitscomplexglobal behaviordeterminedbyadaptiveweightedconnectionsbetweenprocessingelements andelementparameters.InanANNneuronsarerepresentedasnodes,andsynapses asweightedconnectionsbetweennodes.Nodesarerepresentedbyanarrayofacti9 AdditionalfactsfromChristopherBishop's"PatternRecognitionandMachineLearning"and ai-junkie.com[10][1] 13

PAGE 19

Figure2:Articialneuron vationvalues,oneforeachnode,andweightsarerepresentedbymatriceswithan entryforeachpossiblepairofnodesbetweentwoconnectedlayers.Nodesreceive input,applysomesortofsummationfunction,thenoutputalongadjacentweighted connectionsaccordingtoanactivationfunctionontheweightedsumoftheinputs,as illustratedinFigure2.Theseactivationfunctionscanbebinary,likebiologicalneuralnetworks,orgraded.Thesenodesandconnectionsformanetworkthatlearnsby meansofanalgorithmthatmodiestheweightsbetweennodeshencethestructure ofthenetworkbasedonexternalorinternalinformationtoproduceadesiredsignal ow.Eventually,acertaininputyieldsacertainknownoutput,meaningthenetwork performssomefunction. WhileANNstypicallyabstractkeycharacteristicsofbiologicalneuralnetworks, therearesomewaysinwhichthebiologicalmodelcanbeimprovedbytechnology (sincecomputersarenotconstrainedtobiologicalpressures).Forinstance,computers arecapableofcreatingandmaintainingasmanyneuronsasneeded(althoughthemost complexANNstodateuseontheorderofhundredsorthousandsofnodesrather than100billion).Additionally,computersperformoperationsveryfastandarenever inaccurateorfatigued(thisdoesnotmeanthatanarticialneuralnetworkcan'tbe incorrectorslow,butthateachindividualcomputationcomprisingtheprocesswon't be).Somepuristsrejecttheomissionofmodeledbiologicalpressures,thinkingthem 14

PAGE 20

tobeavitalandirremovablefactorinthecomputingmechanismofbiologicalneural networks,howevermuchmoreoftentheyareignored.Onthewhole,thetrendof modernsoftwareimplementationsofarticialneuralnetworkshasbeenlargelyto abandontheapproachinspiredbybiologyinfavorofamorepracticalapproach basedonstatisticsandsignalprocessing.Inthispracticalapproach,componentsof articialneuralnetworksareusedaspartsofalargersystemutilizingbothadaptive andnon-adaptiveelements. 3.4Typesofproblemsforwhicharticialneuralnetsarebest suited 10 WolpertandMacready's"NoFreeLunch"theorem 11 (1997)statesthatall optimizationalgorithmsareequivalentwhentheirperformanceisaveragedacrossthe entirecandidateproblemset.Whiletheproofofthistheoremismathematically technicalandnon-intuitive,itisgenerallyacceptedasmathematicalfact. Sinceitisknownthatnotalloptimizationalgorithmsperformequallywellon eachproblem,wecaninferthateachindividualalgorithmmustperformbetteron someproblemsthanonothers.Thisisaconsequenceoftheobservationthatthere canbenogloballyoptimalalgorithmforsolvingoptimizationproblemsduetothe assertionthatonaverage,performanceofallalgorithmsacrossallproblemscomes outtobethesame.Therefore,eachalgorithmhasstrengthsandweaknesses,and individualproblemsarebestsolvedwithcertainalgorithmsoverothers. Thissuggeststhatproblemsexistforwhicharticialneuralnetworksshouldprovideahighlye"cientapproach.However,practicalapplicationsforANNshavebeen slowinpresentingthemselves.Incomputers,hardwarelargelydeterminesthesoftwarethatrunse"cientlyonit.Whileeverycomputerisinherentlyaprogrammable 10 FromJamesA.Anderson,"IntroductiontoNeuralNetworks"[8] 11 FromD.H.WolpertandW.G.Macready,"Nofreelunchtheoremsforoptimization"[28] 15


electronic device that can perform binary arithmetic and logic functions, there is much more to our modern devices. Computers physically suited to adaptation and learning have only been in existence for the last decade or so, necessarily being able to handle behavioral change, changes in internal functioning, and potential instability. This problem can be perceived as a parallel to the way in which we and chimpanzees differ. While 99% of our DNA is the same, the 1% difference predominantly lies in genes which modify sizes and shapes of brain structures.^12 While we contain no radical new hardware compared to chimps, rearrangements, inflations, contractions, and other modifications of the same physical structure have made our brains capable of considerably more computational power and flexibility.

It is important not to lose sight of this evolutionary perspective when considering ANNs. Artificial neural networks, like brains, cannot compute everything. Their performance is highly detail-dependent, arriving at reasonable solutions only for a limited and specific set of problems. The practical applications of neural nets are the same problems selective pressure has caused animals to biologically adapt to solve: association, classification, categorization, generalization, rapid response, and quick inference from inadequate data.

In the brain, functions are performed collectively and in parallel rather than with a clear delineation of subtasks to which various neurons are assigned. This is a fundamental distinguishing property of artificial neural nets, and the reason why they are so useful for time-sensitive tasks.

ANNs are universal classifiers primarily used to model complex relationships between inputs and outputs or to find patterns in data. The most successful applications of neural nets to date are: function approximation or regression analysis, including time series prediction and modeling; classification, including pattern recognition, novelty detection, and sequential decision making; and data processing, including filtering,

^12 From James A. Anderson, "Introduction to Neural Networks" [8]


clustering, blind signal separation, and compression.

3.5 Why an artificial neural network is a good fit for this problem

Medical diagnosis of melanoma requires visual recognition of cancerous lesions, which falls under the theme of classification through pattern recognition (and is a problem currently solved by biological neural networks).


Figure 3: Gradient descent

4 Artificial neural networks

4.1 Necessary concepts^13

There are several prerequisite concepts to understanding the difference between various types of neural nets.

The first is supervised and unsupervised learning. For all neural nets, a training set of data is necessary to teach the network the connection weights that generate the desired flow. Sometimes, the appropriate output for a given data set is known, and that information can be used to calculate, and then minimize, the discrepancy between the actual and expected output by modifying the network weights to yield the desired flow. This process is called supervised learning. Sometimes, however, the appropriate output is not known, and instead the system is required to take the input data and somehow organize it in an appropriate and yet unknown way. Methods to suit these needs are much more difficult to construct and use, yet are capable of producing insightful and reliable solutions to difficult problems. This is known as unsupervised learning. This thesis utilizes supervised learning.

The second concept is gradient descent. Gradient descent is an optimization

^13 From Christopher Bishop, "Pattern Recognition and Machine Learning" [10]


algorithm that finds a local minimum of a function. It takes steps from a starting point proportional to the negative of the gradient of the function at that point to descend towards the minimum as quickly as possible (see Figure 3). Sometimes an approximate instead of exact gradient is used to further speed the process. It is important to notice that gradient descent does not necessarily converge to the absolute minimum. The minimum it converges to is based on the value at which the procedure starts. Gradient descent is based on the observation that if the multivariable function F(x) is defined and differentiable in a neighborhood of a point a, then F(x) decreases fastest if one moves from a in the direction of the negative gradient of F at a, −∇F(a). It follows that, if b = a − γ∇F(a) for step size γ > 0 a small enough number, then F(a) ≥ F(b). It is important to choose γ correctly, because it is possible for gradient descent to oscillate around the correct solution. This is demonstrated by F(x) = x^2 with a = 1 and γ = 1, which oscillates between 1 and −1. With this observation in mind, one starts with a guess x_0 for a local minimum of F, and considers the sequence x_0, x_1, x_2, ... such that x_{n+1} = x_n − γ_n ∇F(x_n), n ≥ 0 (note that the value of the step size γ is allowed to change at every iteration). Thus, we have F(x_0) ≥ F(x_1) ≥ F(x_2) ≥ ..., which shows that the sequence (x_n) converges to a local minimum given that F is bounded below.

The last and possibly most important concept fundamental to neural nets (and the general field of machine learning) is Bayesian probability. Prior to defining Bayesian probability, it is important to understand the distinction between a few terms, namely probability, statistics, and likelihood. Probability and statistics are related areas of mathematics which concern themselves with analyzing the relative frequency of events. Probability deals with predicting the chances of future events, while statistics involves the analysis of the frequency of past events. More technically, probability is primarily a theoretical branch of mathematics, which studies the consequences of mathematical definitions, while statistics is primarily an applied branch of math-


ematics, which tries to make sense of observations in the real world. As stated by Steven Skiena, "probability theory enables us to find the consequences of a given ideal world, while statistical theory enables us to measure the extent to which our world is ideal."^14 The likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values. It is a function of the parameters of a statistical model, and cannot be conceived of before an outcome has been observed.

In general, statistics are viewed in the classical or frequentist interpretation, where they are defined in terms of the frequencies of random, repeatable events. In contrast, Bayesian statistics provide a quantification of uncertainty, or a measure of a state of knowledge. It's an extension of logic that enables reasoning with uncertain statements whose outcomes cannot be repeatedly measured. The underlying probability theory of Bayesian and classical statistics is the same. To evaluate the Bayesian probability of a hypothesis, some prior probability must first be specified, which is updated in light of new relevant data.

For instance, we speak of the probability of events such as the earth being destroyed by an asteroid or a specific individual dying. These are not repeatable events from which we can calculate a probability through ratios involving frequencies. However, they still have a probability of occurrence that can be inferred through educated guessing considering relevant evidence, such as the frequency of asteroids that have come close and the impact it has taken to destroy planets of comparable size, or the person's consumption of carcinogens and participation in hazardous activities. When new evidence is received, it can be processed and the hypothesis reassessed appropriately. Bayesian probability allows quantifications of such expressions of uncertainty.

In 1946, physicist Richard Cox showed that if degrees of belief (such as those used in Bayesian probability) are numerically quantified, then a simple set of axioms which

^14 From Steven Skiena, "Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win" [25]


encode common-sense properties of these beliefs can be used to intuitively construct a set of rules for manipulating degrees of belief that are equivalent to the sum and product rules of classical statistics. In 2003, statistical theorist Edwin Thompson Jaynes used Cox's axioms to prove that probability theory could be conceived as an extension of Boolean logic to situations involving uncertainty. Over the years, many mathematicians have proposed various sets of axioms similar to Cox's that all have behaved precisely according to the rules previously outlined for classical statistics.

The central idea of Bayesian probability is defined in Bayes' theorem:

p(w|D) = p(D|w) p(w) / p(D)

where D is the set of observed data and w is the event whose uncertainty we wish to numerically express. This equation expresses the uncertainty of the event w depending on the set of observed data D. The conditional probability p(D|w), known as the likelihood function, stands for the probability of the given observed data D if the event w is known to occur. The prior probability distribution p(w) expresses the uncertainty of w before the data D is taken into account, and is most often a purely subjective assessment by someone with experience. The product of these two probabilities is scaled to ensure that the function integrates to one and is therefore a valid probability density. In other words, the posterior probability is proportional to the likelihood times the prior probability distribution, where all of these quantities are viewed as functions of w.

An advantage Bayesian probability holds over classical probability is that the inclusion of prior knowledge arises naturally. For instance, classical statistics would reject the hypothesis that a coin is fair if it lands on tails 90 out of 100 times in an observed data set, whereas Bayesian statistics could include a prior probability distribution reflecting a set of observed data where the coin landed on tails 50 times of 100.
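The coin example above can be made concrete with a small numerical Bayes update. The sketch below discretizes the unknown tail-probability onto a grid and applies the theorem directly; the grid resolution and the pseudo-data used to build the informed prior are illustrative assumptions, not values from this thesis.

```python
# A minimal sketch of the Bayesian update p(w|D) ∝ p(D|w) p(w) for the coin
# example above. The grid of candidate tail-probabilities and the prior
# pseudo-data (50 tails of 100) are illustrative assumptions.
tail_probs = [i / 100 for i in range(1, 100)]  # candidate values of w

def posterior(prior, tails, heads):
    # Binomial likelihood; the binomial coefficient cancels in normalization.
    unnorm = [p * (w ** tails) * ((1 - w) ** heads)
              for w, p in zip(tail_probs, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

flat_prior = [1 / len(tail_probs)] * len(tail_probs)
# A prior shaped by an earlier data set where the coin landed tails 50 of 100.
informed_prior = posterior(flat_prior, 50, 50)

# Update that prior with the extreme 90-tails-of-100 data set.
post = posterior(informed_prior, 90, 10)
mean = sum(w * p for w, p in zip(tail_probs, post))
print(round(mean, 3))  # pulled toward fairness, well below the 0.9 frequentist estimate
```

The posterior mean lands near 0.7 rather than 0.9: the prior evidence of fairness tempers the new, extreme data, which is exactly the behavior the paragraph above describes.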


Figure 4: The perceptron function

As evidenced above, the probability Bayes' theorem assigns is highly dependent on the prior distribution used, which is a considerable weakness. Bayesian methods based on poorly chosen prior distributions can yield poor results with high confidence. Despite this, Bayesian principles are still very popular in pattern recognition and other machine learning paradigms, as the ability to incorporate evidence in real time is hugely advantageous.

4.2 The perceptron^15

The perceptron was the first influential model of a simple artificial neural network, proposed by psychologist Frank Rosenblatt in 1958. It is the simplest kind of feedforward (no cycles or loops, the connections only move forward) neural network, functioning as a linear classifier. It works only when two patterns are linearly separable, meaning a hyperplane can be drawn in the problem space in such a way that completely separates the two classes.

The perceptron function is defined as:

f(x) = 1 if w · x + b > 0, and 0 otherwise

where x is the input vector, f(x) is the output vector, w is the weight vector,

^15 From Christopher Bishop, "Pattern Recognition and Machine Learning" [10]


and b is the bias. The bias is a constant value that serves to alter the position (not orientation) of the decision boundary so that the threshold value is 0. This function is a binary classifier where 1 indicates a positive case (on one side of the hyperplane) and 0 represents a negative (on the other side of the hyperplane).

The perceptron's method of learning is described by the perceptron learning algorithm (Algorithm 1 on the next page). It uses knowledge of past results to modify the weights of the connections, thereby improving the performance of the network. This is conceptually similar to Bayesian statistics, in which likelihoods can be updated to reflect new data.

Note that the weights are adapted immediately for each pair instead of at once for the entire vector, and also that weights are only updated if there is error, i.e. only when learning will occur. The perceptron convergence theorem states that if there exists an exact solution, the perceptron learning algorithm is guaranteed to learn the classification in a finite number of steps. If there is not an exact solution and the training data is not linearly separable, the algorithm will run indefinitely without converging. Unfortunately, many interesting classifications are not linearly separable and require a more complex decision surface to separate the classes, so perceptrons do not suffice.

The perceptron has an input sensory layer called a retina, which is partially connected to an association layer. It is important theoretically that these layers are not fully connected; this establishes that each unit in the association layer computes a different function of the image on the retina. The association layer is then reciprocally connected to a response layer by way of unidirectionally modifiable weighted "synapses" (in the direction of the response layer). The goal of the perceptron is the activation of a single appropriate unit in the response layer for a certain set of inputs. The activation of any appropriate unit is a candidate solution.

The basic computing element of the perceptron is an artificial neuron known as


Algorithm 1 Perceptron Learning Algorithm

Input: a training set D = {(x_1, d_1), ..., (x_n, d_n)}
Output: a trained model y = f(x)

Algorithm:
  INITIALIZE the weights and the threshold
    set w_0 = b
    set x_{j,0} = 1
    set w_i(0) = small random values
  TRAIN the network
    for each sample j in D:
      calculate the actual output
        y_j(t) = f(w(t) · x_j) = f(w_0(t) x_{j,0} + ... + w_n(t) x_{j,n})
      adapt the weights
        for each node i in n:
          w_i(t+1) = w_i(t) + γ (d_j − y_j(t)) x_{j,i}
  repeat until iteration error d_j − y_j(t) < ε

where:
  y = f(x) is the perceptron's output for an input vector x
  b is the bias
  D = {(x_1, d_1), ..., (x_n, d_n)} is the training set, where x_j is the input vector and x_{j,i} is the value of the i-th node of the j-th input vector
  d_j is the desired output value of x_j
  w is the weight vector
  w_i is the i-th value of the weight vector, to be multiplied by x_{j,i}
  w_i(t) is the i-th weight at time t
  γ is the learning rate, where 0 < γ ≤ 1 (a learning rate set too high will periodically oscillate around the value of the solution)
  ε is a user-specified error threshold
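Algorithm 1 can be exercised end to end on a small, linearly separable problem. The sketch below trains a perceptron on the AND function; the learning rate, epoch cap, and initialization range are illustrative choices, not values from the thesis.

```python
# A runnable sketch of Algorithm 1 (perceptron learning) on the linearly
# separable AND function. Learning rate and epoch cap are illustrative.
import random

random.seed(0)  # deterministic initial weights for reproducibility

def train_perceptron(samples, rate=0.1, epochs=200):
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n + 1)]  # w[0] acts as the bias
    for _ in range(epochs):
        errors = 0
        for x, d in samples:
            xj = [1.0] + list(x)  # x_{j,0} = 1 carries the bias weight
            y = 1 if sum(wi * xi for wi, xi in zip(w, xj)) > 0 else 0
            if y != d:            # weights change only when there is error
                w = [wi + rate * (d - y) * xi for wi, xi in zip(w, xj)]
                errors += 1
        if errors == 0:           # exact solution found: convergence theorem satisfied
            break
    return w

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, [1.0] + list(x))) > 0 else 0

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # AND truth table
w = train_perceptron(data)
print([predict(w, x) for x, _ in data])  # matches the AND targets after convergence
```

Because AND is linearly separable, the convergence theorem guarantees the loop terminates with every sample classified correctly; swapping in XOR targets would make the loop run to the epoch cap without converging, as the text notes.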


Figure 5: Multi-layer perceptron

the threshold logic unit, or TLU, comprised of n input weights with strengths w[i], as shown in Figure 4. It sums the product of the input weights and their respective strengths and transforms them into binary values indicating activation if the scaled summed input is over a certain threshold. There is no graded activation.

Although its learning capabilities are extremely limited, the perceptron proved to be a valuable tool in prediction and explanation of human cognition. Interestingly, the limitations that make it less effective as a computing device are very similar to those of humans, and as such it was widely used among psychologists and cognitive scientists to help form many theories regarding the workings of the brain. Along with the ADALINE (an early gradient descent algorithm using a non-binary error function), the perceptron framed the network learning problem in a way that was fundamentally accepted and integrated into scientific thought, shaping the evolution of computational science and future network models for years.

4.3 The multi-layer perceptron^16

The multi-layer perceptron (MLP) is a feed-forward artificial neural network model consisting of multiple fully connected layers of nodes in a directed graph that maps sets of input data onto a set of appropriate output. It is proven to be a universal

^16 Facts from James A. Anderson's "An Introduction to Neural Networks" and DeepLearning.net [8][4]


function approximator by the universal approximation theorem, otherwise known as the Cybenko theorem.

MLPs contain multiple stages of processing, comprised of an input layer, an output layer, and at least one latent (hidden) layer (referred to as such because their function is obscured from sight). The difference between perceptrons and multi-layer perceptrons lies in the full connectedness and the intermediate units' use of continuous sigmoidal nonlinearities as activation functions instead of step functions, not in the number of layers, as the name suggests. This type of processing means that the neural network function is differentiable with respect to the network parameters, a fact which allows the network to be trained through backpropagation. If the latent units used linear activation functions, the network could be simplified to an input, association, and output layer, i.e. a perceptron (because the composition of successive linear transformations is itself a linear transformation).

Each node of an MLP has a non-linear (sigmoidal) activation function, meant to model the frequency of action potentials of biological neurons, instead of the simplified binary function. The two most common activation functions are the hyperbolic tangent function y_i = tanh(v_i) with range (−1, 1), and the logistic function y_i = (1 + e^{−v_i})^{−1} with range (0, 1), where y_i is the output of the i-th node and v_i is the weighted sum of the input synapses.

MLPs use backpropagation, a supervised learning technique, to train the network. Backpropagation is an efficient technique for evaluating the gradient of an error function for a feed-forward neural net so that gradient descent can be used. It can be perceived as a local "message passing" scheme in which information is sent both forwards and backwards through the network. Backpropagation is dependent on the activation function being differentiable. Learning in the network occurs by modifying the connection weights by a factor calculated through gradient descent after a piece of data is processed, so that the change is based on the error generated compared to


the expected result.

The first phase of network training is [forward] propagation. The training data is propagated forward through the network, generating output activations by applying the selected sigmoidal function to the weighted sum of the inputs. The input vector x is applied to the network and propagated forward using a_j = Σ_i w_ji z_i and z_j = h(a_j), where z is the activation, w is the weight, h is the nonlinear activation function, and a is the weighted sum of the inputs. The error d_k is then calculated for all the output units using d_k = y_k − t_k, where y_k is the actual output (a linear combination of the input variables) of the k-th unit of x, and t_k is its target output. After this, backward propagation of the output activations through the ANN occurs, using the expected output vector as a starting point. The algorithm backpropagates using d_j = h′(a_j) Σ_k w_kj d_k to obtain the error for each latent unit in the network. The value of d for a particular latent unit can be obtained by propagating the previous values of d backwards from units higher up in the network. An example of this process can be seen in Figure 10 on page 46. Because we already know the error values for the output units, we can evaluate them for all of the latent units in a feed-forward network by recursively applying the equation above. The equation dE/dw_ij = d_j z_i is used to evaluate the required derivatives, where E is the summed error function, w_ij is the weight, z_i is the activation of unit i, and d_j is the error.

The second phase of training is weight update, using the error values d_k calculated in phase one. For each weighted synapse, first multiply d_k and the input activation to get the gradient of the weight, then update the weight in the opposite direction of the gradient by subtracting a ratio of it from the original weight. This ratio is determined by a factor called the learning rate. The learning rate applies a greater or lesser portion of the weight calculated by the past iteration to each new adjustment. It is analogous to the step size parameter of gradient descent. Cheshire


Algorithm 2 Multi-layer perceptron learning

Input: a set of training vectors and a stopping criterion (number of iterations or desired accuracy)
Output: a trained model

Algorithm:
  Initialize the weights in the network
  While stopping criterion not satisfied
    For each input vector x
      Apply the vector x to the network and forward propagate through the network to find the activations of all the latent and output units:
        a_j = Σ_i w_ji z_i
        z_j = h(a_j)
      Evaluate the errors for all the output units:
        d_k = y_k − t_k
      Backpropagate the errors to obtain errors for each latent unit in the network:
        d_j = h′(a_j) Σ_k w_kj d_k
      Evaluate the required derivatives:
        E = (1/2) Σ_k d_k^2
        dE/dw_ji = d_j z_i
      Update weights in the negative direction of the derivatives

where:
  a_j is the weighted sum of the inputs of unit j
  z_i is the activation of a unit that sends a connection to unit j
  w_ji is the weight associated with that connection
  h is the nonlinear activation function
  d_k is the error associated with the output units
  y_k is the network output value, a linear combination of input variables
  t_k is a value specified by the error function
  E_n is the summed error function
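Algorithm 2 can be sketched compactly with one hidden layer trained on XOR, the classic problem a single perceptron cannot solve. The layer size, learning rate, and iteration count below are illustrative choices, not values from the thesis, and bias terms are added for convenience even though the algorithm box omits them.

```python
# A compact runnable sketch of Algorithm 2: one tanh hidden layer, linear
# output units, squared error, full-batch gradient descent on XOR.
# Layer sizes, learning rate, and iteration count are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0.0, 0.5, (2, 8))   # input -> latent weights
b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, (8, 1))   # latent -> output weights
b2 = np.zeros(1)
rate = 0.5

for _ in range(10000):
    a1 = X @ W1 + b1                 # a_j = sum_i w_ji z_i
    z1 = np.tanh(a1)                 # z_j = h(a_j)
    y = z1 @ W2 + b2                 # linear output units
    dk = y - T                       # output errors d_k = y_k - t_k
    dj = (1 - z1 ** 2) * (dk @ W2.T) # latent errors d_j = h'(a_j) sum_k w_kj d_k
    W2 -= rate * z1.T @ dk / len(X)  # dE/dw_ji = d_j z_i; step down the gradient
    b2 -= rate * dk.mean(axis=0)
    W1 -= rate * X.T @ dj / len(X)
    b1 -= rate * dj.mean(axis=0)

print(np.round(y.ravel(), 2))
```

Note how the backward pass reuses the quantities computed in the forward pass (z1 and the derivative of tanh, 1 − z1², evaluated at a1), which is exactly the "message passing" efficiency the text attributes to backpropagation.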


Engineering Corporation's webpage^17 describes a large learning rate as punishing a child with a spanking: it is highly effective, however inappropriate and counterproductive if the offense is very minor. If the learning rate is too high, the system can diverge or oscillate around the true solution, and if it's too low, convergence will be very slow. Values have range (0, 1].

An optional factor that can affect weight update is the momentum factor. The momentum factor allows a change to the weights to persist for a number of adjustment iterations, thus smoothing out unusual conditions in the training set. It's used to prevent the system from converging to a local minimum or saddle point, intending instead to find the absolute minimum. If the momentum is set too high, the system can overshoot the minimum and eliminate necessary variance in the data set. If it is set too low, the system is more likely to converge to a local minimum. Values have range [0, ∞).

Phases one and two are repeated until the desired accuracy is achieved, or for a set number of iterations. If the set number of iterations is too low, the system might not have a chance to fully converge. If it's set too high, the system will converge and then run pointlessly for the specified duration. This can be avoided by combining a high number of iterations with a desired accuracy, so that training terminates if either is reached.

The performance of a neural net depends significantly on its number of hidden layers. The higher the number of hidden layers, the more nuance the network can model; however, too many layers often leads to overtraining. Overtraining is a phenomenon that occurs when the network models its training set too perfectly, including insignificant details that do not aid in classification. When this occurs, the network ceases to function on any new data.

The main practical limitation to the performance of MLPs is slow convergence that is not guaranteed. Because gradient descent may converge to any local minimum on

^17 Located at http://www.cheshireeng.com/ [9]


the error surface (a p-dimensional object in a (p+1)-dimensional space) depending on the starting values, the MLP may not reach the absolute minimum. Despite this, multi-layer perceptrons employing a backpropagation algorithm are the standard algorithm for any supervised learning pattern recognition process. They are useful in research in terms of their ability to solve problems stochastically, which often allows approximate solutions for extremely complex problems to be calculated.
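The momentum-augmented weight update discussed above can be sketched as a velocity term added to plain gradient descent: the previous change persists, scaled by the momentum factor. The coefficient values and the quadratic test function here are illustrative assumptions, not values from the thesis.

```python
# A minimal sketch of the momentum factor in the weight-update phase.
# Learning rate, momentum coefficient, and the test function are illustrative.
def momentum_step(w, grad, velocity, rate=0.1, momentum=0.9):
    """One update: the previous change persists, scaled by the momentum factor."""
    velocity = momentum * velocity - rate * grad(w)
    return w + velocity, velocity

grad = lambda w: 2 * w  # gradient of F(w) = w^2, minimized at w = 0
w, v = 5.0, 0.0
for _ in range(300):
    w, v = momentum_step(w, grad, v)
print(abs(w) < 1e-3)    # the iterate has settled into the minimum at w = 0
```

With momentum at 0, this reduces to the plain gradient descent of Section 4.1; with momentum near 1, the accumulated velocity can carry the iterate across shallow local dips, which is the behavior the text describes (at the risk of overshooting if the factor is set too high).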


5 Considered and selected approaches

5.1 Questions of intent

Prior to any software undertaking, it is necessary to determine answers to a few key questions that will shape the approach to the solution. Each chosen direction leads to a new set of questions, creating a candidate map along the many roads to resolution.

In regard to this thesis, the first such decision was which approach should be taken towards the problem of medical image recognition and what the focus should be. There were many possible directions: a focus on research, a focus on comparative performances of methods, a focus on a functional product, a focus on an optimized product, or any combination of these. Each approach had its merits and disadvantages. For instance, while researching different algorithmic approaches and comparing their past performances in other studies would supplement general knowledge of algorithms and machine learning, it wouldn't improve programming skills. Contrastingly, generating a functional product would provide a plethora of practical experience, but the subsequent knowledge of computer vision and machine learning techniques would be selectively deep rather than broad. In the end, practical knowledge trumped theoretical for two reasons: a finished product is beneficial as a display of abilities for future employment opportunities, and also as a potential diagnostic tool for a serious disease.

This verdict led to another fork: whether to write the program from scratch or make use of open source code. Programming from the ground up could result in a unique library geared towards computer vision and machine learning as well as a very thorough understanding of each intricacy of the process, concluding with a wealth of experience in algorithms and software development. However, creating a strongly


associative classifier for data as complex as skin lesion images would presumably require more than two semesters. The other option, utilizing preexisting software and libraries as building blocks and expanding on the products of those who have successfully conquered aspects of the problem, actually poses the same design challenges as independent coding and requires a comparable level of comprehension, while at the same time being much more applicable to the field of practical software development. In a company vying for profit, time is valuable and none is wasted rewriting existing code. The time saved can be put to use revising and optimizing, or to being a step ahead of the competition. Not only is reusing code highly advantageous in this situation (a single programmer with a complex problem), but employers are more likely to be impressed by results than determination. Additionally, identification of melanoma is a relevant problem and the proposed software could have present and propitious applications. Obviously, it is both more responsible and reasonable to follow the route of functionality and marketability.

With that settled, the next task would usually be to determine the desired algorithmic approach. In this case, the original intent was to explore artificial neural networks, so this was not an issue. As stated previously, artificial neural networks function very well on classification problems. However, other classification-oriented machine learning algorithms exist; among them are quadratic classifiers, decision trees, Bayesian networks, and hidden Markov models.

5.2 Preprocessing techniques

While neural networks are flexible in their functionality, passing them a complete, unaltered image is unlikely to be the best option. Typically, preprocessing techniques are used to convert the image into a data representation of its key features that is passed to the network. There are countless ways of approaching this problem. Regardless of method, the first step of preprocessing an image is normalization.


The image must be centered and scaled to an appropriate size and level of detail (ideally uniformly relative across the data set) while retaining information about the absolute size, lighting, and color. Usually, this involves first calculating data that the process of normalization alters, then cropping and magnifying the image to facilitate feature extraction. The next step is to select the features you wish to extract. There are many established paradigms of feature extraction within the field of medical image recognition; these can be used as either set rules or guidelines. After selecting the features, they must be extracted. There are various ways of representing the extracted data to the network. The most common method is a binary vector where each input represents either the presence or absence of a feature. While assigning these nuanced variables to a binary does not account for the variability of feature representation, and may diminish accuracy for a small set of selective features, most successful methods employ a large set of selectors, thus reducing the amount of loss. Also, binary responses are inherently characteristic of neural networks, and they are excellent classifiers in spite of the apparent lack of nuance.

5.3 Selection of features^18

Before feature extraction can occur, a schema of feature selection must be determined. Features selected for extraction must be quantifiably measurable and representative, with both a high sensitivity and specificity. Many established schema for feature selection of skin lesion images already exist. Most of these schema numerically assign values for certain dermoscopic criteria, falling into the categories of local and global features. Global features allow an expeditious preliminary categorization of a skin lesion, following which local features are used to assess the lesion in detail. Local features can be thought of as the "letters of the dermoscopic alphabet." They consist of pigment networks, dots and globules, streaks, a blue-whitish veil, pig-

^18 From dermoscopy.org [5]


mentation, hypopigmentation, regression structures, and vascular structures. Global features, composed of local ones, are reticular, globular, cobblestone, homogeneous, starburst, parallel, multicomponent, lacunar, and unspecific patterns.

Pigment networks resemble a thin grid superimposed over the lesion's background. Dots and globules are the names given to irregular rounded structures in varying colors and sizes. Streaks are linear structures of varying thickness, which may be observed throughout the lesion but are not clearly combined with pigment network lines. A blue-whitish veil is a hazy and diffuse pigmentation overlaid on other structures. Pigmentation refers to dark areas which preclude recognition of more subtle dermoscopic features. Hypopigmentation refers to an area of decreased pigmentation within an otherwise ordinary pigmented lesion. Regression structures are areas of white, blue, or both, resembling either a scar or speckled peppering. They are sometimes mistaken for a blue-grey veil when the peppering is not thoroughly distinctive. There are seven types of vascular structures: comma vessels, wreath vessels, arborizing vessels, hairpin vessels, dotted vessels, linear irregular vessels, and vessels within regression structures. They become more readily apparent when the tumor is very lightly compressed.

Reticular patterns are the most common in melanocytic lesions, characterized by a pigment network covering most of the surface of the lesion. Globular patterns consist of numerous irregularly distributed globules. Cobblestone patterns are very similar, but the globules are larger and closely aggregated, resembling cobblestones. Homogeneous patterns appear in the absence of other features as diffuse pigmentation ranging from brown to blue-grey to reddish-black. Starburst patterns appear at the edges of skin lesions and are comprised of pigmented streaks in a radial arrangement. Parallel patterns are exclusive to lesions in glabrous (hairless) regions and are characterized by a series of parallel pigmented streaks. Multicomponent pattern is the name given to combinations of three or more distinctive dermoscopic structures in a given region.


Lacunar patterns are aggregations of red lacunas: rounded structures with smooth borders in either red, blue-purple, or black. When a structure cannot be categorized into any of the aforementioned patterns, it is deemed "unspecific."

In addition to local and global features, asymmetry, border, and color are significant dermoscopic criteria.

Three of the most well-established and benchmarked methods for feature selection are the ABCD rule, Menzies' method, and the seven-point checklist.

5.3.1 ABCD rule (Stolz method)^19

The ABCD method is an algorithmic analogue of the heuristic used by dermatologists to distinguish melanoma from melanocytic lesions. Like the heuristic, it is an acronym in which the A stands for asymmetry, the B for border, C for color, and D for differential structures. It was developed by W. Stolz and co. in 1994, and its reliability was proven by F. Nachbar et al. the same year. Two independent dermatopathologists examined and scored by hand 172 melanocytic lesions consisting of 69 melanomas and 103 melanocytic nevi, yielding a sensitivity of 92.8% and a specificity of 90.3%. The method is easily learned and rapidly calculated, providing a reliable, objective, and reproducible preliminary diagnosis of melanoma.

The procedure of Stolz's method is to semiquantitatively assess each of the four aforementioned categories and assign them a numeric score. Each score is scaled by a weight factor indicating empirically determined importance, then the sum of these weighted scores yields the total dermatoscopy score, or TDS. Multivariate analysis yielded the weights:

TDS = 1.3A + 0.1B + 0.5C + 0.5D

^19 Facts from W. Stolz and F. Nachbar, "The ABCD rule of dermatoscopy: High prospective value in the diagnosis of doubtful melanocytic lesions" [21]


Image obtained from dermoscopy.org [5]
Figure 6: Stolz's ABCD rule for identifying melanoma

A TDS of less than 4.75 is considered benign, between 4.8 and 5.45 suspicious, and greater than 5.45 highly suspicious.

Asymmetry is calculated by bisecting the lesion with two optimally positioned perpendicular axes. No points are awarded if the lesion is symmetric over both axes, one is awarded if it is symmetric over one axis, and two are awarded if neither axis yields symmetry. Symmetry is assessed considering colors and structures in addition to shape. At 54% of the TDS, this is the most highly weighted criterion; the majority of melanomas receive a score of two, compared to only 25% of benign lesions.

The border score ranges from zero to eight, where each point represents a sharp and distinct cut-off within one 45° "pie piece" (one eighth of the lesion border). This is the weakest criterion, accounting for only 4% of the TDS. The majority of melanomas score between three and eight points, whereas most benign lesions score below three.

One color point is awarded for each color present in the lesion. The palette consists of light brown, dark brown, red, blue-grey, black, and white (lighter than adjacent skin). Thus the score ranges from one to six, where most melanomas score three or more, and 40% score five to six. Color accounts for 21% of the TDS.
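The TDS computation above is simple enough to express as a small function using the published weights and cut-offs; the example sub-scores passed in below are illustrative, not measurements from the thesis data set.

```python
# A runnable sketch of Stolz's total dermatoscopy score (TDS), using the
# weights and cut-offs quoted above. Example sub-scores are illustrative.
def tds(asymmetry, border, colors, structures):
    """asymmetry 0-2, border 0-8, colors 1-6, differential structures 1-5."""
    return 1.3 * asymmetry + 0.1 * border + 0.5 * colors + 0.5 * structures

def interpret(score):
    if score < 4.75:
        return "benign"
    if score <= 5.45:
        return "suspicious"
    return "highly suspicious"

print(interpret(tds(asymmetry=0, border=1, colors=2, structures=2)))  # TDS 2.1 -> benign
print(interpret(tds(asymmetry=2, border=6, colors=5, structures=4)))  # TDS 7.7 -> highly suspicious
```

Note that the published thresholds leave a small gap between 4.75 and 4.8; the sketch folds that gap into the "suspicious" band, which is a design choice rather than part of the rule.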


The final criterion of Stolz's method is the presence of differential structures. The five candidates are a pigment network, streaks (two or more), dots (two or more), globules (one or more), and structureless or homogeneous areas (covering 10% or more of the lesion). A point is awarded for each clearly visible structure, ranging from one to five. Like color, differential structures account for 21% of the TDS.

5.3.2 Menzies' method^20

Menzies' method, formulated by Menzies in 1996, is a simple equation,

M = p − n

where M is Menzies' score, p stands for the number of positive features, and n stands for the number of negative. The higher the score, the greater the chance of melanoma. The set of negative features consists of symmetry and a single even color. The set of positive features consists of a blue-whitish veil, brown dots, pseudopods, radial streaming, scar-like depigmentation, five to six colors, blue or grey dots, and a broadened network. Possible scores range from negative two to eight.

Menzies' method is not a very strong classifier and is only used as a very preliminary test indicating whether further diagnostic action should be taken.

5.3.3 Seven-point checklist^21

Proposed by G. Argenziano in 1998, the seven-point checklist is a list of seven features selected for their high correlation with instances of melanoma. Three features are considered "major," receiving two points, and four are considered "minor," receiving one. The scores for each feature are summed, and if the score is greater

^20 From dermoscopy.org [5]
^21 Argenziano et al., "Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions. Comparison of the ABCD rule of dermatoscopy and a new 7-point checklist based on pattern analysis." [14]


than three, the likelihood of melanoma is high. Argenziano found that using this method, dermatologists were able to classify melanoma with a sensitivity of 95% and a specificity of 75%, correctly diagnosing 82% of melanocytic lesions.

The major criteria consist of an atypical pigment network, a blue-whitish veil, and an atypical vascular pattern. The minor criteria consist of irregular streaks, irregular pigmentation, irregular dots and/or globules, and regression structures.

5.3.4 Formulated method used in this thesis²²

Obviously, the point of an approach utilizing an artificial neural net is to establish a new method for diagnosing melanoma. These approaches were useful as a guideline and helpful in suggesting benchmarks against which to compare the network's performance. Unlike these approaches, however, the extraction process is performed by a machine instead of a trained professional human eye, which drastically decreases its accuracy. Preprocessing extracts information about features present in the lesion, after which the network is expected to reach the same conclusion about the representativeness of certain features over others, if not the same overall performance.

After careful consideration of some strongly correlated features and algorithms through which to extract each, it was determined that the feature vector should include five entries representing strong corners present in the edge, five entries representing the presence of different colors on the surface, and one representing the presence of differential features. The reasoning is that this approach, while unique, is derivative of Stolz's ABCD method.

The first part of the technique is taken directly from Stolz's method. A vector is returned indicating with either a one or a zero if the colors brown, black, white, blue, and/or red are present. First, a filter is applied to the image that turns any pixel

²² Facts about algorithms from Gary Bradski and Adrian Kaehler's "Learning OpenCV: Computer Vision with the OpenCV Library." [11]


Figure 7: Color detection

Figure 8: Corner detection

white if its RGB value is not within the range of the specified color and black if it is. Then a detection window is looped over the resultant image until a region is found in which the ratio of represented pixels to total pixels is above a certain threshold.

The presence of strong corners indicates both that the lesion is asymmetric and that the edge is sharp. First, the image is run through a bilateral filter, which reduces noise and smooths the image while preserving edges. The filter functions by replacing the intensity value of each pixel in an image with a weighted average of intensity values from adjacent pixels in a defined region. It preserves edges by calculating the weights using specifiable radiometric (color intensity) differences as well as Euclidean distance. Because of the typical strong radiometric difference between a lesion and the surrounding skin and the weaker difference between structures within a lesion, a bilateral filter with a relatively large tolerance for radiometric changes and a medium-sized window returns an image of a lesion whose structures are obscured but whose edges are clearly discernible. After this, a corner detection algorithm (OpenCV's


Figure 9: Structure detection

GoodFeaturesToTrack()) is applied to the image, returning corners with big eigenvalues as compared to a specified quality level times the maximum eigenvalue in a defined region. Finally, a vector is returned with up to five ones indicating how many corners are above the provided threshold.

The last part of the technique determines whether or not differential structures are present in the lesion. First, the blurred image from the corner detection step is passed through a Canny edge detector that traces around the lesion. Next, an ellipse is fitted around the outline of the image and the bounding rectangle of the ellipse is returned. After that, the image is converted to greyscale and the contrast of the image is doubled. Feature detection occurs when an algorithm is run on the region inside the aforementioned bounding rectangle that detects if the radiometric variance within the lesion is above a certain threshold. This is based on the observation that differential structures manifest themselves visually as areas with different coloring than the surrounding region.

This process covers the bases of asymmetry, border, color, and differential structures, just like Stolz's method. Each image in the data set is passed through this program and reduced to an 11-point feature vector, with a 12th feature called the bias indicating whether or not the training vector represents a melanoma. The set of feature vectors is then passed to the network to learn.
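The layout of that vector can be sketched directly; the helper below is an illustrative stand-in for the extraction modules (five color flags, five corner flags, one structure flag, plus the label):

```python
def assemble_feature_vector(colors, corners, structures, is_melanoma):
    """Concatenate the three extractor outputs into the 11-point feature
    vector, paired with the melanoma label used during training."""
    assert len(colors) == 5 and len(corners) == 5 and len(structures) == 1
    return [colors + corners + structures, [1 if is_melanoma else 0]]
```

For example, a lesion showing brown, black, and blue, three strong corners, and differential structures would be encoded as an 11-entry vector of ones and zeros paired with its label.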


5.4 Practical limitations

Unfortunately, classifier programs are practically bounded by certain limitations, the biggest of which are connections to local doctors, financial capital, computing power, and man-hours. These constraints limit practicable image acquisition techniques, affordable software, and the general scope of the project. With no connections or financial capital, limited computing power, and only one programmer, this project was seriously constrained in comparison to other studies.

The first problem was the acquisition of a training set of images. Most of the published studies for machine classification of melanoma obtained images from a local dermatologist. This option was unavailable for this project, so instead free online databases were used. There were only two apparent free options.²³ Between the two of these sites, 101 images of melanoma and 129 benign melanocytic lesions were obtained, consisting very predominantly of dysplastic nevi. Since data sets must be split into subsets for training, testing, and validation, that left a limited supply where melanoma were vastly overrepresented (in reality, less than 3% of melanocytic lesions are melanomas, as opposed to the whopping 44% in this data set). Additionally, the data represents a very strong selection bias towards dysplastic nevi, which strongly resemble melanoma, because very few people go to the dermatologist to examine a common nevus. Biases in the input data reflect in neural nets' classifications.

Due to time constraints and having only a single programmer, it was imperative to make use of software that had already been written rather than customizing along each facet. Additionally, due to lack of funding, no professional software could be used, only freeware. Of this freeware, some very promising libraries recently became inactive due to conflicting retrograde dependencies with host programs, and some did not have a

²³ From dermquest.org and vase.essex.ac.uk/software/jasmine/moles.html [2][22]


wrapper for a familiar language. Luckily, open source software is increasingly popular, so sufficient computer vision and artificial neural network libraries were not difficult to find. (See Section 5.5.)

Perhaps most importantly, the preprocessing technique is limited to allow the program to run in a reasonable amount of time. Ideally, the parameters at each step of the process would be individualized for each image. However, automation of the tailoring process is in and of itself a whole field of machine learning. This means that each image must be processed with a uniform set of values that may work very well or very poorly, depending on the image (but on average work acceptably). This uncertainty pervades the entire extraction process. For instance, in the blurring step, a high tolerance for radiometric changes may completely obscure a lesion composed of many globules with soft edges. The color detection algorithm is capable of missing an entire region of a certain color due to unlucky window placement, where at each step the presence ratio is just below the threshold. And increasing the contrast in the last step can obscure structures whose pixel intensity was similar to that of the lesion's background. Unfortunately, the constraints of this project allow neither the time nor the computing power to solve these problems, so the project must rely on the strength of the selected features as well as the power of the neural net to mitigate them.

Despite these numerous limitations, artificial neural networks remain strong and flexible classifiers, and it remains feasible to produce a high-functioning and strongly-correlated network that classifies melanoma.


5.5 Software

5.5.1 Computer vision

OpenCV²⁴ is an open source computer vision library written in optimized C. Its goal is to "provide a simple-to-use computer vision infrastructure that helps people build fairly sophisticated vision applications quickly." It includes 500+ functions covering factory product inspection, medical imaging, security, user interfacing, camera calibration, stereo vision, robotics, and more, and contains a full machine learning library (MLL) focused on statistical pattern recognition and clustering.

5.5.2 Artificial neural network

The neural network used is a modification of Neil Schemenauer's Python module bpnn.py, which he placed in the public domain through his personal website.²⁵ It is a simple and customizable multi-layer perceptron utilizing backpropagation. It uses the logistic function f(x) = 1/(1 + e^(−x)) for the hidden and output activations. Output activations are equivalent to the classification, since they represent the percent likelihood of a positive case.

A network is initiated with the number of nodes in the input, hidden, and output layers. In the input layer, the number of nodes corresponds to the number of features for each input, and in the output layer it corresponds to the number of output values for each input. In this model, there is only one hidden layer with a customizable number of nodes, and individual weighted connections between each node in the input and hidden layer and hidden and output layer with a non-linear activation function. This creates a flow functionally similar to a network of multiple hidden layers with fewer nodes per layer, however with less computation. There are

²⁴ From Gary Bradski and Adrian Kaehler, "Learning OpenCV: Computer Vision with the OpenCV Library" [11]
²⁵ Located at http://arctrix.com/nas/ [24]


still multiple levels of non-linearity (at the hidden and output level), which is nuanced enough to model most phenomena.

Next, the training step is called with the list of inputs as a mandatory parameter and number of iterations, learning rate, and momentum factor as optional parameters defaulted to 5000, 0.5, and 0.9 respectively. The list of inputs is formatted such that each entry in the list is a list containing both a vector of input values and a vector of output values. For example,

    XOR = [[[0,0], [0]],
           [[0,1], [1]],
           [[1,0], [1]],
           [[1,1], [0]]]

encodes the XOR (exclusive OR) function, which returns a positive if either one of the entries is positive, but not both.

The network is trained one input vector at a time. The first step of training for each vector is forward propagation, which consists of calculating the hidden and output activations. This is done using the logistic function with x equal to the sum of all of the inputs to the node scaled by their corresponding weight. The next step of training is backpropagation. First, the error terms for each node in the output layer are calculated as a(1 − a)(t − a), where a is the node's activation and t is the target value. Next, the error terms for each node in the hidden layer are calculated as a(1 − a)σ, where a is the node's activation and σ is the sum of each node of the output layer's error terms scaled by a corresponding weight. After this calculation, the weights between the hidden and output layer are modified, followed by the weights between the input and hidden layer. Since the layers are fully connected, the weights are stored in matrices with a row for each node in the first layer and a column for


each node in the second, meaning there is an entry for every pair of nodes. For each weight between the layers, a value is added (it is not always positive). The value is N·e·a + M·c, where N is the learning rate, e is the error term of the particular output node, a is the activation of the particular hidden node, M is the momentum factor, and c is the value of e·a from the last adjustment cycle. After the provided number of iterations, the network is trained.

Returning to the example of the XOR function, Figure 10 on the next page shows the steps of network training for the first iteration of the first row, [[0,0],[0]]. For the sake of convenience, the weight matrices are initialized with all entries 0.5 instead of at random. The number of nodes in the hidden layer is two.
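Under those starting conditions the forward pass can be checked by hand. With inputs [0, 0], the implicit bias input fixed at 1.0, and every weight 0.5, each hidden sum is 0.5, so each hidden activation is f(0.5) ≈ 0.622; the output sum is 2 × 0.622 × 0.5 ≈ 0.622, giving an output activation of about 0.651. A minimal sketch reproducing that forward pass and the output error term (structure mirrors the bpnn-style network described above):

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Two inputs plus a bias unit fixed at 1.0; every weight starts at 0.5.
inputs = [0.0, 0.0, 1.0]
w_hidden = 0.5      # each input-to-hidden weight
w_output = 0.5      # each hidden-to-output weight
n_hidden = 2

# Forward propagation for the training row [[0,0],[0]].
hidden = [logistic(sum(i * w_hidden for i in inputs)) for _ in range(n_hidden)]
output = logistic(sum(h * w_output for h in hidden))

# Output error term a(1 - a)(t - a), with target t = 0.
target = 0.0
delta_out = output * (1 - output) * (target - output)
```

Because the target is 0 and the output is above 0.5, the error term is negative, so the first backpropagation step pushes the hidden-to-output weights down.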


Figure 10: First step of XOR function network training for first input


6 Final product

6.1 Results

6.1.1 Project results

Despite numerous limitations, the project still achieved considerable success. Its performance can be seen in Figure 11.²⁶ After 40 trial runs charting performance over various settings, it was determined that on average, a neural net trained with ideal parameter values is able to correctly classify lesions 64.48% of the time, with average sensitivity 53.22% and average specificity 72.82%. Training was performed with a random 80% of the lesion set and tested with its complement.

The ideal parameters are 11 nodes in the input layer, eight nodes in the hidden layer, one node in the output layer, a learning rate of 0.5, a momentum factor of 0.9, and 5000 iterations. Over a variety of settings, performance ranged from 44-78% accuracy, 12-86% sensitivity, and 54-100% specificity. These results were invariably more accurate than random guessing (which would yield 44% accuracy, with 101 melanoma and 129 benign melanocytic lesions). Especially considering the strong selection bias of the data set towards dysplastic nevi, these results are impressive, if less than ideal. Every lesion the network trained on was suspected to be a melanoma by a dermatologist. Unfortunately, despite considerable searching, no standardized data set for benchmarking performance on this problem was mentioned or found.

The first parameter tested was the number of nodes in the hidden layer. For this step, the learning rate was held constant at 0.5, the momentum at 0.9, and the number of iterations at 5000. With a small number of nodes (four), results

²⁶ Accuracy is [true positives + true negatives] / [true positives + false positives + true negatives + false negatives], sensitivity is [true positives] / [true positives + false negatives], specificity is [true negatives] / [true negatives + false positives], positive predictive power is [true positives] / [true positives + false positives], and negative predictive power is [true negatives] / [true negatives + false negatives].


Figure 11: Results


varied widely depending on the initial distribution of the data. Over six runs, total accuracy ranged from 51-78%, with sensitivities between 24-86% and specificities between 55-70%. These networks were not able to reach a consistently appropriate level of nuance with so few nodes through which to create a function flow, although it is worth noting that the network with the best performance was trained with these conditions. Six nodes performed more strongly, but were similarly inconsistent and had very low sensitivities (45% on average). Eight nodes proved to be much more consistent, with accuracy hanging tightly around 64%, sensitivity around 53%, and specificity around 73%. At ten nodes, the network began to exhibit signs of overtraining, declining to 61% average accuracy, 52% average sensitivity, and 73% average specificity.

The ideal number of iterations was easy to determine and required little testing. Although the error almost invariably continues to decrease with each iteration, it does so at an increasingly slow rate. After the first several hundred, improvement drops considerably; after several thousand it is on the order of hundredths of a percent every hundred iterations. Five thousand iterations ran in a reasonable amount of time and produced strong results not significantly weaker than 8000.

Finally, the learning rate and momentum factor were tested. While a higher momentum yielded higher average accuracy (68%), average sensitivity dropped (47%). Increasing the learning rate increased the specificity (79%), however for the problem of melanoma identification, specificity is the least important value. With the concurrent drops in accuracy (60%) and sensitivity (37%), no variation performed better than the standard respective values of 0.5 and 0.9.
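The statistics quoted throughout this section follow the definitions in the footnote to Section 6.1.1; their computation reduces to a few lines of arithmetic over the confusion-matrix counts. A minimal sketch (the counts in the test below are hypothetical):

```python
def confusion_stats(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, and the two predictive powers
    from true/false positive and negative counts, per the footnoted
    definitions."""
    return {
        "accuracy": (tp + tn) / float(tp + fp + tn + fn),
        "sensitivity": tp / float(tp + fn),
        "specificity": tn / float(tn + fp),
        "positive_predictive_power": tp / float(tp + fp),
        "negative_predictive_power": tn / float(tn + fn),
    }
```

For melanoma screening, sensitivity (the share of true melanomas caught) is the value to protect, since a false negative is far costlier than a false positive.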


6.1.2 Similar project results

Teams of doctors and scientists have conducted similar projects with higher yet comparable levels of success.

In chronological order, the first such project is Fikret Ercal et al.'s "Neural network diagnosis of malignant melanoma from color images" [13]. Their project was remarkably similar to this one, except that Ercal et al. focused mainly on color data (asymmetry and border data were included as well). Ercal et al. used a data set of 240 lesions comprised of 120 nevi (40 of which were dysplastic) and 120 melanoma. Their best results came from 14 input features, seven hidden neurons, and one output classification. Their average accuracy was 76.8%, average sensitivity was 80%, and average specificity was 86.3%.

The next project, Stephan Dreiseitl et al.'s "Comparison of machine learning methods for the diagnosis of pigmented skin lesions" [15], trained an artificial neural network on 107-feature vectors calculated from 1619 lesions (comprised of 1290 common nevi, 224 dysplastic nevi, and 105 melanoma). Their average accuracy was 82.88%, average sensitivity 76.86%, and average specificity 73.95%.

Next is Yi Ding et al.'s "A computer assisted diagnosis system for malignant melanoma using 3D skin surface texture features and artificial neural network" [16]. Yi Ding et al. used only 46 lesions: 34 benign lesions and 12 melanomas. Their goal was to compare the performance of networks trained on 2D and 3D feature information extracted from the same images. Both networks had a sensitivity of 91.7%, however the network trained on two-dimensional feature data had a specificity of only 25.7%, compared to 76.4% for the network trained on three-dimensional feature data (accuracy statistics were not available).

Ilias Maglogiannis and Charlampos N. Doukas' "Overview of Advanced Computer Vision Systems for Skin Lesions Characterization" [20] used a data set of 3707 lesions,


comprised of 2598 common nevi, 1014 dysplastic nevi, and 95 melanoma. They compared performance of different algorithms on these images using a 31-point feature vector comprised of detailed border, color, and textural features. Their artificial neural network ran with a learning rate of 0.3 and no momentum for 500 iterations. The result was an average accuracy of 87%, average sensitivity of 81%, and average specificity of 86.7%.

Most recently, Pablo G. Cavalcanti and Jacob Scharcanski's "Automated prescreening of pigmented skin lesions using standard cameras" [12] used a neural network trained on detailed feature vectors modeling the ABCD rule. They trained on 152 lesions (45 nevi and 107 melanoma) and yielded an accuracy of 75.66%, a sensitivity of 75.24%, and a specificity of 80%.

Of these professional projects, the average accuracy is 80.59%, average sensitivity 80.96%, and average specificity 80.67%. The fact that the average values of this project (64.48%, 53.22%, and 72.82% respectively) are not as high can be primarily attributed to smaller and/or less representative sample sizes and a less detailed feature vector. The single biggest advantage of these professional projects is that their data sets contained a much lower proportion of dysplastic nevi to total benign lesions. It is worth noting that at best, this project performed with respective values of 77.50%, 85.71%, and 68.42%. Given an ideal data distribution, this project is competitive with similar ones of professional caliber.

6.2 Possible improvements

Although the project was successful, there is always room for improvement. Most simply, the performance of the neural net would be greatly improved by a larger, more varied, and more representative data set. The inclusion of common nevi along with


Figure 12: The Haar wavelet

dysplastic would drastically increase specificity, and having a lower ratio of melanoma to benign lesions would render the classification more applicable to the real world, in which melanomas make up a very small percentage of all lesions. Ideally there should exist a standardized and optimized database on which to train and test machine learning techniques, so that the performance of all approaches can be compared and the success of an approach never has to be contingent on the data set used.

Another improvement would be a more detailed feature vector. The studies above used feature vectors of up to 107 points, compared to my 11. The more detailed the input data, the more nuanced a pattern the network is capable of finding.

In two of the studies mentioned above, the performance of different machine learning techniques on the problem of skin lesion classification was compared. Several approaches were instituted with higher levels of success than artificial neural networks. The most promising of these approaches is boosted Haar features.

Among past machine learning approaches to medical image recognition, the machine learning technique of boosted Haar features is relatively common. Haar-like features are digital image features used in the field of object recognition, named for their intuitive similarity to the Haar wavelet (see Figure 12). A wavelet is a nonlinear function, collections of which form bases that are used to reconstruct more complex functions via projection. The Haar wavelet is the simplest possible wavelet. A set of Haar wavelets can be transformed into any other wave by scaling each individually and summing the set. The use of Haar-like features categorizes an image into


Figure 13: Summed area table

subsections based on the difference between summed pixel intensities in adjacent rectangular regions at a specific location in a detection window. For instance, in the facial recognition problem, Haar wavelets are used to distinguish eyes, which are commonly shadowed, from cheeks, which are commonly the lightest part of the face. The key advantage of detecting a Haar-like feature over other types of features is the calculation speed. If a summed area table (also known as an integral image) is used, a Haar-like feature of any size can be calculated in constant time. A summed area table is a quick and efficient algorithm that stores summed values of a rectangular subset of a grid, where the value at any point (x, y) in the summed area table is the sum of the entries from (0, 0) to (x, y) inclusive in the data table, as seen in Figure 13.

Several popular algorithms utilize Haar-like features, such as the Viola-Jones object detection framework.²⁷ Proposed in 2001 as a solution to the facial recognition problem, it was the first framework to provide competitive object detection rates in real time. In the detection phase, a target-sized window is passed over the input image, defining subsections. For each subsection of the image the pixel intensities are summed, thus calculating the Haar-like feature. The difference is then compared against a learned threshold that distinguishes objects from non-objects. Due to its simplicity, a Haar-like feature is only a weak learner or classifier; its detection quality

²⁷ From Paul Viola and Michael Jones, "Robust Real-time Object Detection" [27]


is only slightly better than random guessing. Therefore either a large number of Haar-like features or a boosting algorithm to strengthen the most representative ones is necessary to describe an object with sufficient accuracy.²⁸

Boosting is a machine learning meta-algorithm implemented alongside supervised learning algorithms to transform a set of weak learners into a single strong learner: a classifier whose performance is strongly correlated with the true classification. Meta-algorithms are iterative optimization algorithms that try to improve a candidate solution (a member of the set of all feasible solutions to the given problem) with regard to a given measure of quality. In Reddy and Bhuyan's paper, Haar-like features are used in conjunction with the AdaBoost algorithm to increase their detection power. The AdaBoost (adaptive boosting) algorithm predisposes classifiers built later in the process in favor of instances misclassified by previous classifiers by use of weights.²⁹ In each run, it generates and calls a new weak classifier, and updates the distribution of weights to indicate the importance of specific data for the classification by increasing the weights of incorrectly classified examples and decreasing those of correctly classified ones. This causes the new classifier to focus on the more heavily weighted former examples. In Reddy and Bhuyan's paper, the image category is trained, then the weights and weak classifiers are stored. Next, the AdaBoost algorithm takes several weak classifiers (the Haar-like features) and develops them into stronger models after a number of iterations, at which point the highly selective features that minimize the classification error are extracted. The AdaBoost algorithm was the first boosting algorithm to gain prominence. Since then, LPBoost, TotalBoost, BrownBoost, MadaBoost, LogitBoost, and more have had success as well.

²⁸ From Chandan K. Reddy and Fahima A. Bhuyan, "Retrieval and Ranking of Biomedical Images using Boosted Haar Features" [23]
²⁹ From Yoav Freund and Robert E. Schapire, "A Short Introduction to Boosting" [17]
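The reweighting loop described above can be sketched with one-dimensional threshold stumps standing in for the weak classifiers (the data and stump set below are hypothetical illustrations, not Reddy and Bhuyan's Haar features):

```python
import math

def adaboost(examples, labels, weak_learners, rounds):
    """Schematic AdaBoost: each round picks the weak classifier with the
    lowest weighted error, then raises the weights of misclassified
    examples so later classifiers focus on them. Labels are +1/-1;
    weak_learners is a list of functions x -> +1/-1."""
    n = len(examples)
    w = [1.0 / n] * n
    ensemble = []  # (alpha, learner) pairs
    for _ in range(rounds):
        best, best_err = None, float("inf")
        for h in weak_learners:
            err = sum(wi for wi, x, y in zip(w, examples, labels) if h(x) != y)
            if err < best_err:
                best, best_err = h, err
        # Clamp to avoid log(0); alpha grows as the weighted error shrinks.
        best_err = min(max(best_err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - best_err) / best_err)
        ensemble.append((alpha, best))
        # Increase weights of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * y * best(x))
             for wi, x, y in zip(w, examples, labels)]
        total = sum(w)
        w = [wi / total for wi in w]

    def strong(x):
        # Weighted vote of the selected weak classifiers.
        return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
    return strong
```

On a linearly separable toy problem a single well-chosen stump dominates the vote; the interesting behavior appears when no single weak classifier is adequate and the reweighting forces complementary ones into the ensemble.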


Figure 14: Flow of code

Boosted Haar features could be used in conjunction with an artificial neural net for increased classifying power. One could be constructed for each feature of a feature vector, then the resultant vector could be classified by a neural network in the same manner as this project. This would drastically improve the feature extraction process, which is basic to begin with and additionally is set to average parameter values that performed well with some images and poorly with others. Instead of increasing the quantity of the feature vector as suggested above, this would increase its quality.
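The constant-time Haar evaluation that makes this approach fast rests on the summed-area table illustrated in Figure 13; a minimal sketch of its construction and the rectangle query it enables:

```python
def summed_area_table(grid):
    """Build a table where entry (x, y) holds the sum of grid entries
    from (0, 0) through (x, y) inclusive."""
    h, w = len(grid), len(grid[0])
    sat = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sat[y][x] = (grid[y][x]
                         + (sat[y - 1][x] if y else 0)
                         + (sat[y][x - 1] if x else 0)
                         - (sat[y - 1][x - 1] if x and y else 0))
    return sat

def region_sum(sat, x0, y0, x1, y1):
    """Sum over any rectangle in four lookups -- the constant-time
    property that makes Haar-like features cheap to evaluate."""
    total = sat[y1][x1]
    if x0:
        total -= sat[y1][x0 - 1]
    if y0:
        total -= sat[y0 - 1][x1]
    if x0 and y0:
        total += sat[y0 - 1][x0 - 1]
    return total
```

A two-rectangle Haar-like feature is then just the difference of two `region_sum` calls over adjacent rectangles, regardless of their size.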


7 Code and sample output

The project is comprised of eight modules, each representing a specific task and specifically designed to be usable both independently and in conjunction with the other modules. The flow of dependencies can be seen in Figure 14. Classify.py first calls preprocess.py, which uses color_extract.py, corner_extract.py, and structure_extract.py to compose a feature vector for each image. Corner_extract.py and structure_extract.py both require bilateral filters, which are applied using image_blur.py. Next, classify.py splits the feature vectors into training and testing sets, then passes each to artificial_neural_network.py, where the network is first trained, then tested on the respective sets. Finally, stats.py is called to calculate the statistics accuracy, sensitivity, specificity, and average error.


Listing 1: classify.py

    import artificial_neural_network
    import preprocess
    import stats
    import random
    import pickle
    import pprint

    pp = pprint.PrettyPrinter()

    n_mels = 101
    n_moles = 129
    train_imgs = []
    test_imgs = []

    # TRAIN_IMGS AND TEST_IMGS ARE ARRAYS OF STRINGS REPRESENTING
    # THE FILE PATHS TO THE IMAGES
    for i in range(n_mels):
        img_str = 'lesion_images/mel' + str(i + 1) + '.jpg'
        if random.random() >= 0.2:
            train_imgs.append(img_str)
        else:
            test_imgs.append(img_str)

    for i in range(n_moles):
        img_str = 'lesion_images/mole' + str(i + 1) + '.jpg'
        if random.random() >= 0.2:
            train_imgs.append(img_str)
        else:
            test_imgs.append(img_str)

    # BEGIN
    print '***********************************'
    print
    print 'PREPROCESSING THE TRAINING SET'
    print
    print '***********************************\n'

    prep_train = preprocess.Preprocessor(train_imgs)
    train_lesions = prep_train.process()

    print '\n***********************************'
    print


    print 'TRAINING THE NETWORK'
    print
    print '***********************************\n'

    ann = artificial_neural_network.ANN(11, 8, 1)
    ann.train(train_lesions, 5000)

    print '\n***********************************'
    print
    print 'PREPROCESSING THE TEST SET'
    print
    print '***********************************\n'

    prep_test = preprocess.Preprocessor(test_imgs)
    test_lesions = prep_test.process()

    print '\n***********************************'
    print
    print 'CLASSIFYING THE TEST SET'
    print
    print '***********************************\n'

    results = ann.test(test_lesions)
    pp.pprint(results)

    print '\n***********************************'
    print
    print 'ACCURACY STATISTICS'
    print
    print '***********************************\n'

    [counts, sens, spec, acc, err] = stats.accuracy(results)
    [total, correct, t_mel, c_mel, t_mole, c_mole] = counts

    print 'Size of the test set:', len(results)
    print '\nSensitivity =', sens * 100, '%'
    print '\tTotal melanoma =', t_mel
    print '\tCorrectly identified melanoma =', c_mel
    print '\nSpecificity =', spec * 100, '%'
    print '\tTotal benign melanocytic lesions =', t_mole
    print '\tCorrectly identified benign melanocytic lesions =', c_mole
    print '\nAccuracy =', acc * 100, '%'
    print '\tTotal lesions =', total
    print '\tCorrectly identified lesions =', correct


    print '\nAverage error =', err * 100, '%'

    # SAVE THE TRAINED ANN
    saved_ann = open('trained_ann.obj', 'w')
    pickle.dump(ann, saved_ann)

    # SAVE THE TRAIN DATA
    saved_train_lesions = open('train_lesions.obj', 'w')
    pickle.dump(train_lesions, saved_train_lesions)

    # SAVE THE TEST DATA
    saved_test_lesions = open('test_lesions.obj', 'w')
    pickle.dump(test_lesions, saved_test_lesions)

    # SAVE THE RESULTS
    saved_results = open('results.obj', 'w')
    pickle.dump(results, saved_results)


Listing 2: preprocess.py

    import cv

    import color_extract
    import corner_extract
    import structure_extract

    class Preprocessor:

        def __init__(self, imgpaths):
            self.imgs = []
            for i in imgpaths:
                img = cv.LoadImageM(i, 1)
                if "mel" in i:
                    classification = 1
                elif "mole" in i:
                    classification = 0
                self.imgs.append((img, classification))

        def process(self):
            results = []
            n = 1
            for i in self.imgs:
                print "/////// PROCESSING IMAGE", n, "///////"
                colors = color_extract.getColors(i[0])
                corners = corner_extract.getCorners(i[0])
                structures = structure_extract.getStructures(i[0])
                feature_vector = colors + corners + structures
                results.append([feature_vector, [i[1]]])
                n += 1
            return results


Listing 3: artificial_neural_network.py

    import numpy as np
    import pprint

    pp = pprint.PrettyPrinter()

    class ANN:

        def __init__(self, ni, nh, no):
            # NUMBER OF INPUT, HIDDEN, AND OUTPUT NODES
            self.ni = ni + 1
            self.nh = nh
            self.no = no

            # ACTIVATIONS
            self.ai = [1.0] * self.ni
            self.ah = [1.0] * self.nh
            self.ao = [1.0] * self.no

            # WEIGHT MATRICES
            self.wi = np.random.random((self.ni, self.nh))
            self.wo = np.random.random((self.nh, self.no))

            # LAST CHANGE IN WEIGHTS FOR MOMENTUM
            self.ci = np.zeros(shape=(self.ni, self.nh))
            self.co = np.zeros(shape=(self.nh, self.no))

        def update(self, inputs):
            if len(inputs) != self.ni - 1:
                raise ValueError, 'Error: incorrect number of inputs'

            # INPUT ACTIVATIONS
            for i in range(self.ni - 1):
                self.ai[i] = inputs[i]

            # HIDDEN ACTIVATIONS [sum(wi * xi), bias included as an input node]
            for j in range(self.nh):
                sigma = 0.0
                for i in range(self.ni):
                    sigma += self.ai[i] * self.wi[i][j]


                self.ah[j] = 1.0 / (1.0 + np.exp(-sigma))

            # OUTPUT ACTIVATIONS [sum(wi * xi), bias included as an input node]
            for j in range(self.no):
                sigma = 0.0
                for i in range(self.nh):
                    sigma += self.ah[i] * self.wo[i][j]
                self.ao[j] = 1.0 / (1.0 + np.exp(-sigma))

            return self.ao

        def backPropagate(self, targets, N, M):
            if len(targets) != self.no:
                raise ValueError, 'Error: incorrect number of target values'

            # ERROR TERMS FOR OUTPUT
            output_deltas = np.zeros(self.no)
            for i in range(self.no):
                ao = self.ao[i]
                output_deltas[i] = ao * (1 - ao) * (targets[i] - ao)

            # ERROR TERMS FOR HIDDEN
            hidden_deltas = np.zeros(self.nh)
            for i in range(self.nh):
                sigma = 0.0
                for j in range(self.no):
                    sigma += output_deltas[j] * self.wo[i][j]
                hidden_deltas[i] = self.ah[i] * (1 - self.ah[i]) * sigma

            # UPDATE OUTPUT WEIGHTS
            for i in range(self.nh):
                for j in range(self.no):
                    change = output_deltas[j] * self.ah[i]
                    self.wo[i][j] = self.wo[i][j] + N * change + M * self.co[i][j]
                    self.co[i][j] = change

            # UPDATE INPUT WEIGHTS


            for i in range(self.ni):
                for j in range(self.nh):
                    change = hidden_deltas[j] * self.ai[i]
                    self.wi[i][j] = self.wi[i][j] + N * change + M * self.ci[i][j]
                    self.ci[i][j] = change

            # TOTAL ERROR
            error = 0.0
            for i in range(self.no):
                delta = targets[i] - self.ao[i]
                error += 0.5 * delta ** 2
            return error

        def train(self, patterns, iterations=5000, N=0.5, M=0.9):
            # N: learning rate
            # M: momentum factor
            for i in xrange(iterations):
                error = 0.0
                for p in patterns:
                    inputs = p[0]
                    targets = [p[1][0]]
                    self.update(inputs)
                    error += self.backPropagate(targets, N, M)
                if i % 100 == 0:
                    print 'error:', error

        def test(self, patterns):
            sus = []
            for p in patterns:
                su = self.update(p[0])
                sus.append(su[0])
            for i in range(len(sus)):
                patterns[i][1].append(sus[i])
            return patterns


Listing 4: stats.py

    def accuracy(results):
        total = len(results)
        correct = 0
        t_mel = 0
        t_mole = 0
        c_mel = 0
        c_mole = 0
        agg_error = 0
        for r in results:
            ex = r[1][0]
            act = r[1][1]
            agg_error += abs(ex - act)
            if act < .5:
                act_t = 0
            else:
                act_t = 1
            if ex:
                # SENSITIVITY (CORRECT POSITIVES)
                t_mel += 1
                if act_t:
                    c_mel += 1
                    correct += 1
            else:
                # SPECIFICITY (CORRECT NEGATIVES)
                t_mole += 1
                if not act_t:
                    c_mole += 1
                    correct += 1
        sensitivity = c_mel / float(t_mel)
        specificity = c_mole / float(t_mole)
        accuracy = correct / float(total)
        avg_error = agg_error / float(total)
        return [[total, correct, t_mel, c_mel, t_mole, c_mole],
                sensitivity, specificity, accuracy, avg_error]


Listing 5: color_extract.py

    import cv
    import numpy as np

    def getColors(img_m):
        print "...extracting colors"
        img = np.asarray(img_m)

        brown_min = [120, 65, 65]
        brown_max = [200, 130, 130]
        white_min = [190, 190, 190]
        white_max = [255, 255, 255]
        black_min = [0, 0, 0]
        black_max = [80, 80, 80]
        blue_min = [50, 50, 125]
        blue_max = [130, 130, 255]
        red_min = [130, 50, 50]
        red_max = [255, 100, 100]

        colors = []
        colors.append(searchColor(img, brown_min, brown_max))
        colors.append(searchColor(img, white_min, white_max))
        colors.append(searchColor(img, black_min, black_max))
        colors.append(searchColor(img, blue_min, blue_max))
        colors.append(searchColor(img, red_min, red_max))

        for i in range(len(colors)):
            if (colors[i]):
                colors[i] = 1
            else:
                colors[i] = 0
        return colors

    def searchColor(img_, cmin, cmax):
        img__ = np.asarray(img_)
        img = np.array(img__)
        # Pixels inside the color range become black, all others white
        for row in range(img.shape[0]):
            for col in range(img.shape[1]):
                x = img[row][col]
                if cmin[2] < x[0] < cmax[2] \
                        and cmin[1] < x[1] < cmax[1] \
                        and cmin[0] < x[2] < cmax[0]:
                    img[row][col] = [0, 0, 0]
                else:
                    img[row][col] = [255, 255, 255]
        # Slide a detection window over the image, looking for a region
        # in which enough pixels are in the color range
        size = 20
        thresh = (size * size) / 2
        found = False
        x = 0
        y = 0
        while (not found):
            if (x + size >= img.shape[0]):
                x = 0
                y += 5
                if (y + size >= img.shape[1]):
                    break
            box = img[x:x + size, :][:, y:y + size]
            n = 0
            for row in box:
                for pixel in row:
                    if (not pixel.any()):
                        n += 1
            if (n >= thresh):
                found = True
            x += 5
        return found


Listing 6: corner_extract.py

    import cv
    import cv2
    import numpy

    import image_blur

    def getCorners(img):
        print "...extracting corners"
        img_b = image_blur.blurImage(img)
        img_arr = numpy.asarray(img_b)
        img_g_mat = cv2.cvtColor(img_arr, cv2.COLOR_RGB2GRAY)
        img_g = cv.fromarray(img_g_mat)
        cv.Scale(img_g, img_g, 2)
        cv.Canny(img_g, img_g, 200, 220)
        eig_image = cv.CreateMat(img_g.rows, img_g.cols, cv.CV_8UC1)
        temp_image = cv.CreateMat(img_g.rows, img_g.cols, cv.CV_8UC1)
        corners = []
        strength_thresh = 0.8
        for (x, y) in cv.GoodFeaturesToTrack(img_g, eig_image, temp_image, 5,
                                             strength_thresh, 10.0, blockSize=5):
            corners.append(1)
        while (len(corners) < 5):
            corners.append(0)
        return corners


Listing 7: structure_extract.py

    import cv
    import cv2
    import numpy

    import image_blur

    def getStructures(img):
        print "...extracting structures"

        # CONVERT TO BLACK AND WHITE
        img_a = numpy.asarray(img)
        img_c = cv2.cvtColor(img_a, cv.CV_RGB2GRAY)
        img_c = cv.fromarray(img_c)

        # DEFINE REGION OF INTEREST
        img_c = image_blur.blurImage(img_c)
        stor = cv.CreateMemStorage()
        img2 = cv.CloneMat(img_c)
        cv.Zero(img2)
        cv.Threshold(img_c, img2, 100, 255, cv.CV_THRESH_BINARY)
        cont = cv.FindContours(img2, stor, cv.CV_RETR_LIST,
                               cv.CV_CHAIN_APPROX_NONE, (0, 0))
        maxcenterx = 0
        maxcentery = 0
        maxsizex = 0
        maxsizey = 0
        maxangle = 0
        contour = None
        measures2 = []
        for c in contour_iterator(cont):
            if len(c) >= 6:
                PointArray2D32f = cv.CreateMat(1, len(c), cv.CV_32FC2)
                for (i, (x, y)) in enumerate(c):
                    PointArray2D32f[0, i] = (x, y)

                # FIT AN ELLIPSE AROUND THE POINTS
                (center, size, angle) = cv.FitEllipse2(PointArray2D32f)

PAGE 74

center=(cv.Round(center[0]),cv. Round(center[1])) size=(cv.Round(size[0] 0.5),cv. Round(size[1] 0.5)) #STORETHENEWMAXELLIPSES if size[0]>maxsizex or size[1]> maxsizey: maxcenterx=center[0] maxcentery=center[1] maxsizex=size[0] maxsizey=size[1] maxangle=angle contour=c r=cv.BoundingRect( PointArray2D32f) measures2.append(r) #RETURNTHELARGESTELLIPSEORRECTANGLE(THATISN'T THEENTIREBORDEROFTHEIMAGE) if (len(measures2)<2): return [0] r=measures2[ 2] cv.Scale(img_c,img_c,2) cv.Canny(img_c,img_c,200,220) img_c=numpy.asarray(img_c) count=0 total=0 for x in range(r[2] 1): for y in range(r[3] 1): i=img_c[r[1]+y][r[0]+x] if (i.any()): count+=1 total+=1 #DELETEACOUNTDENOTINGTHECIRCUMFERENCEOFTHE ELLIPSETHATFITSINSIDETHEREGION,INANATTEMPT TONORMALIZE !! USECIRCUMFERENCEOFCIRCLEWITH RADIUSAVERAGEOFTHETWOAXES circum=((r[2]+r[3])/4) 3.14 ratio=(count circum)/float(total) 100 69

PAGE 75

#RETURNSTRUEIFRATIOISGREATERTHAN5%OFTHE REGIONOFINTEREST,ANDIFTHEREGIONIS SUFFICIENTLYLARGE if (ratio>5 and r[2] r[3]>20): structures=[1] else : structures=[0] return structures def contour_iterator(contour): while contour: yieldcontour contour=contour.h_next() 70
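The normalization at the end of Listing 7 discounts the edge pixels contributed by the lesion's own outline before measuring internal structure: it subtracts the circumference of a circle whose radius is the average of the bounding region's two half-axes, (w + h)/4, then reports the remainder as a percentage of the region. A standalone sketch of that arithmetic; the function name and sample counts are illustrative:

```python
def structure_ratio(edge_count, width, height):
    # Percentage of edge pixels in a width x height region, after discounting
    # the border of an ellipse that fits inside it: a circle of radius
    # (width + height) / 4, matching the normalization in Listing 7.
    total = (width - 1) * (height - 1)
    circum = ((width + height) / 4.0) * 3.14
    return (edge_count - circum) / float(total) * 100

# e.g. 500 edge pixels in a 100 x 80 box: the border accounts for ~141 of them,
# leaving roughly 4.6% of the region as internal structure.
ratio = structure_ratio(500, 100, 80)
```

If the resulting ratio exceeds 5% and the region is large enough, the lesion is flagged as containing dermoscopic structures.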

PAGE 76

Listing 8: image-blur.py

import cv
import cv2
import numpy

def blurImage(img):
    img_arr = numpy.asarray(img)
    sigma_color = 60
    sigma_space = 20
    # d = -1: the neighborhood diameter is computed from sigma_space
    bi_filt_mat = cv2.bilateralFilter(img_arr, -1, sigma_color, sigma_space)
    bi_filt = cv.fromarray(bi_filt_mat)
    return bi_filt
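The bilateral filter in Listing 8 smooths noise while preserving lesion borders: each neighbor's influence falls off with both its spatial distance (controlled by sigma_space) and its difference in intensity (controlled by sigma_color), so pixels across a sharp edge contribute almost nothing. A minimal pure-Python sketch of that weighting on a 1-D signal; this illustrates the general technique, not OpenCV's exact implementation, and `bilateral_1d` and the sample signal are illustrative:

```python
import math

def bilateral_1d(signal, radius, sigma_color, sigma_space):
    # Edge-preserving smoothing: weight = spatial Gaussian * intensity Gaussian.
    out = []
    for i in range(len(signal)):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2.0 * sigma_space ** 2)) *
                 math.exp(-((signal[i] - signal[j]) ** 2) /
                          (2.0 * sigma_color ** 2)))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

# A step edge stays sharp: neighbors across the jump get near-zero weight.
smoothed = bilateral_1d([10, 10, 10, 200, 200, 200], 2, 30.0, 2.0)
```

A plain Gaussian blur with the same radius would average across the step and soften the border, which is exactly what the border and structure extractors need to avoid.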

PAGE 77

Listing 9: Sample output

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
IMPORTING THE DATA
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
TRAINING THE NETWORK
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
error: 5.09075367194
error: 17.1295229478
error: 18.3122883278
error: 18.0245064757
error: 18.2418582286
error: 18.1364958022
error: 17.9341228693
error: 17.7953834987
error: 17.3095563311
error: 17.0625905663
error: 16.7112176434
error: 16.6339603499
error: 16.9568052182
error: 16.8266045433
error: 16.3518436498
error: 16.3268902898
error: 16.2248581036
error: 16.2395903161
error: 16.2494250968
error: 16.2526632909
error: 16.2271156423
error: 16.2233162416
error: 16.2216423189
error: 16.2213608217
error: 16.221024769
error: 16.2202050491
error: 16.2186539405
error: 16.2160403486
error: 16.2120485318
error: 16.2063557914
error: 16.1987450176
error: 16.1896409496
error: 16.1801999145
error: 16.1695773391
error: 16.1570886199
error: 16.1479393628
error: 16.1410826955
error: 16.1352364442
error: 16.1299563312
error: 16.1250819598
error: 16.1205362738
error: 16.1162716589
error: 16.1122542263
error: 16.1084579056
error: 16.1048616164
error: 16.1014477166
error: 16.0982010843
error: 16.0951085404
error: 16.0921584619
error: 16.089340507
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
CLASSIFYING THE TEST SET
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[[[1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0], [1, 0.50000000255680987]],
 [[1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0], [1, 0.50011964901768124]],
 [[1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.99637516479451549]],
 [[1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0], [1, 0.50000739804787309]],
 [[1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.99637516479451549]],
 [[1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0.30372364761351178]],
 [[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.98529907712560372]],
 [[1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0], [1, 0.50001511939740761]],
 [[1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0], [1, 0.50000000255680987]],
 [[1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0], [1, 0.023723619286717772]],
 [[1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.50093168418064304]],
 [[1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0], [1, 0.99637511307992255]],
 [[1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0], [1, 0.7723983215865774]],
 [[1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.50093168418064304]],
 [[1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.99637516479451549]],
 [[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.98529907712560372]],
 [[0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0], [1, 0.30442403432660664]],
 [[1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1], [1, 0.50093168418064304]],
 [[1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0], [1, 0.50001511939740761]],
 [[1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0], [1, 0.50000000255680987]],
 [[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0], [1, 0.86837800350830896]],
 [[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1], [0, 0.016370723537866984]],
 [[1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1], [0, 0.016379133648241291]],
 [[1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1], [0, 0.42669862657791791]],
 [[0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0], [0, 0.27219357184337278]],
 [[1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0], [0, 0.50000000255680987]],
 [[1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0], [0, 0.68997819573215236]],
 [[0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1], [0, 0.019340830455362307]],
 [[0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0], [0, 0.017019156010367409]],
 [[1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1], [0, 0.5000621934971573]],
 [[1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1], [0, 0.50093168418064304]],
 [[1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0], [0, 0.49364998052874182]],
 [[0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1], [0, 0.61642573521371946]],
 [[1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1], [0, 0.42669862657791791]],
 [[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1], [0, 0.016370840596957523]],
 [[1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0], [0, 0.041809036044299468]],
 [[1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0], [0, 0.49364998052874182]],
 [[1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1], [0, 0.016370495836683031]],
 [[1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1], [0, 0.50093168418064304]],
 [[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1], [0, 0.016371999410031139]]]
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
ACCURACY STATISTICS
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Size of the test set: 40
Sensitivity = 85.7142857143%
Total melanoma = 21
Correctly identified melanoma = 18
Specificity = 68.4210526316%
Total benign melanocytic lesions = 19
Correctly identified benign melanocytic lesions = 13
Accuracy = 77.5%
Total lesions = 40
Correctly identified lesions = 31
Average error = 33.3738844071%
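Each entry in the classification dump of Listing 9 pairs a lesion's feature vector with [true label, network output], where an output above 0.5 is read as melanoma. The accuracy statistics then follow directly from the resulting confusion counts; a minimal sketch of that arithmetic (the variable names are illustrative):

```python
# Confusion counts from the run above: 18 of 21 melanoma and 13 of 19
# benign melanocytic lesions were identified correctly.
true_pos, false_neg = 18, 3
true_neg, false_pos = 13, 6

sensitivity = 100.0 * true_pos / (true_pos + false_neg)   # fraction of melanoma caught
specificity = 100.0 * true_neg / (true_neg + false_pos)   # fraction of benign cleared
accuracy = 100.0 * (true_pos + true_neg) / (true_pos + false_neg +
                                            true_neg + false_pos)
```

These reproduce the printed figures: sensitivity 85.71...%, specificity 68.42...%, and accuracy 77.5%.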

PAGE 81

References [1]Ai-junkie.com. [2]dermquest.org. [3]Mayoclinic,August2011. [4]Deeplearning.net,2012. [5]Dermoscopy.org,2012. [6]Nationalcancerinstitutes,2012. [7]Worldhealthorganization,2012. [8]JamesA.Anderson. AnIntroductiontoNeuralNetworks .MassachusettsInstituteofTechnology,thirdedition,1997. [9]RossBerteig.Basicconceptsforneuralnetworks. [10]ChristopherBishop. PatternRecognitionandMachineLearning .SpringerScience+BusinessMedia,LLC,2006. [11]GaryBradskiandAdrianKaehler. LearningOpenCV:ComputerVisionwiththe OpenCVLibrary .O'ReillyMedia,Inc,rstedition,September2008. [12]PabloG.CavalcantiandJacobScharcanski.Automatedprescreeningofpigmentedskinlesionsusingstandardcameras. ComputerizedMedicalImagingand Graphics ,pages481491,2011. [13]FikretErcalet.al.Neuralnetworkdiagnosisofmalignantmelanomafromcolor images. TransationsonBiomedicalEngineering ,41(9),September1994. [14]G.Argenzianoet.al.Epiluminescencemicroscopyforthediagnosisofdoubtful melanocyticskinlesions.comparisonoftheabcdruleofdermatoscopyandanew 7-pointchecklistbasedonpatternanalysis. ArchDermatol ,December1998. [15]StephanDreiseitlet.al.Comparisonofmachinelearningmethodsforthediagnosisofpigmentedskinlesions. JournalofBiomedicalInformatics ,pages2836, January2001. [16]YiDinget.al.Acomputerassisteddiagnosissystemformalignantmelanoma using3dskinsurfacetexturefeaturesandarticialneuralnetwork. Int.J.Modelling,IdenticationandControl ,2009. [17]YoavFreundandRobertE.Schapire.Ashortintroductiontoboosting. Journal ofJapaneseSocietyforArticialIntelligence ,14(5):771780,September1999. [18]DonaldHebb. Theorganizationofbehavior .WileyandSons,1949. 76

PAGE 82

[19]Geo!reyE.Hinton.Deepbeliefnetworks. Scholarpedia ,4(5):5947,2009. [20]IliasMaglogiannisandCharlamposN.Doukas.Overviewofadvancedcomputer visionsystemsforskinlesionscharacterization. IEEETransactionsonInformationTechnologyinBiomedicine ,13(5):721733,September2009. [21]FranzNachbarandWilhelmStolz.Theabcdruleofdermatoscopy:Highprospectivevalueinthediagnosisofdoubtfulmelanocyticskinlesions. Journalofthe AmericanAcademyofDermatology ,30(4):551559,April1994. [22]OllyOechsle.Distinguishingmolesfrommelanomausingjasmine. [23]C.K.ReddyandF.A.Bhuyan.Retrievalandrankingofbiomedicalimagesusing boostedhaarfeatures.In BioInformaticsandBioEngineering,2008.BIBE2008. 8thIEEEInternationalConferenceon ,pages16,October2008. [24]NeilSchemenauer.http://arctrix.com/nas/. [25]StevenSkiena. CalculatedBets:Computers,Gambling,andMathematicalModelingtoWin .UniversityofCambridge,August2001. [26]ShereaM.Stricklin.Melanomainsituinaprivatepracticesetting2005through 2009:Location,lesionsize,lackofconcern. JAmAcadDermatol ,Jan2012. [27]PaulViolaandMichaelJones.Robustreal-timeobjectdetection. International JournalofComputerVision ,2001. [28]D.H.WolpertandW.G.Macready.Nofreelunchtheoremsforoptimization. IEEETransactionsonEvolutionaryComputation ,1(1):6782,April1997. 77

