For your convenience Apress has placed some of the front
matter material after the index. Please use the Bookmarks
and Contents at a Glance links to access them.
Contents at a Glance

About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Getting Started
Chapter 2: Application Fundamentals
Chapter 3: Depth Image Processing
Chapter 4: Skeleton Tracking
Chapter 5: Advanced Skeleton Tracking
Chapter 6: Gestures
Chapter 7: Speech
Chapter 8: Beyond the Basics
Appendix: Kinect Math
Index
Introduction
It is customary to preface a work with an explanation of the author's aim, why he wrote the book, and the relationship in which he believes it to stand to other earlier or contemporary treatises on the same subject. In the case of a technical work, however, such an explanation seems not only superfluous but, in view of the nature of the subject matter, even inappropriate and misleading. In this sense, a technical book is similar to a book about anatomy. We are quite sure that we do not as yet possess the subject matter itself, the content of the science, simply by reading around it, but must in addition exert ourselves to know the particulars by examining real cadavers and by performing real experiments. Technical knowledge requires a similar exertion in order to achieve any level of competence.
Besides the reader's desire to be hands-on rather than heads-down, a book about Kinect development offers some additional challenges due to its novelty. The Kinect seemed to arrive ex nihilo in November of 2010, and attempts to interface with the Kinect technology, originally intended only to be used with the Xbox gaming system, began almost immediately. The popularity of these efforts to hack the Kinect appears to have taken even Microsoft unawares.

Several frameworks for interpreting the raw feeds from the Kinect sensor had been released prior to Microsoft's official reveal of the Kinect SDK in July of 2011, including libfreenect, developed by the OpenKinect community, and OpenNI, developed primarily by PrimeSense, vendors of one of the key technologies used in the Kinect sensor. The surprising nature of the Kinect's release, as well as Microsoft's apparent failure to anticipate the overwhelming desire on the part of developers, hobbyists, and even research scientists to play with the technology, may give the impression that the Kinect SDK is a hodgepodge or even a briefly flickering fad.
The gesture recognition capabilities made affordable by the Kinect, however, have been researched at least since the late 1970s. A brief search on YouTube for the phrase "put that there" will bring up Chris Schmandt's 1979 work with the MIT Media Lab demonstrating key Kinect concepts such as gesture tracking and speech recognition. The influence of Schmandt's work can be seen in Mark Lucente's work with gesture and speech recognition in the 1990s for IBM Research on a project called DreamSpace. These early concepts came together in the central image from Steven Spielberg's 2002 film Minority Report, which captured viewers' imaginations concerning what the future should look like. That image was of Tom Cruise waving his arms and manipulating his computer screens without touching either the monitors or any input devices. In the middle of an otherwise dystopic society filled with robotic spiders, ubiquitous marketing, and panoptic police surveillance, Steven Spielberg offered us a vision not only of a possible technological future but of a future we wanted.
Although Minority Report was intended as a vision of technology 50 years in the future, the first concept videos for the Kinect, code-named Project Natal, started appearing only seven years after the movie's release. One of the first things people noticed about the technology with respect to its cinematic predecessor was that the Kinect did not require Tom Cruise's three-fingered, blue-lit gloves to function. We had not only caught up to the future as envisioned by Minority Report in record time but had even surpassed it.
The Kinect is only new in the sense that it has recently become affordable and fit for mass production. As pointed out above, it has been anticipated in research circles for over 40 years. The principal concepts of gesture recognition have not changed substantially in that time. Moreover, the cinematic exploration of gesture-recognition devices demonstrates that the technology has succeeded in making a deep connection with people's imaginations, filling a need we did not know we had.
In the near future, readers can expect to see Kinect sensors built into monitors and laptops as gesture-based interfaces gain ground in the marketplace. Over the next few years, Kinect-like technology will begin appearing in retail stores, public buildings, malls, and multiple locations in the home. As the hardware improves and becomes ubiquitous, the authors anticipate that the Kinect SDK will become the leading software platform for working with it. Although Microsoft was slow out of the gate with the Kinect SDK, its expertise in platform development, its ownership of the technology, and its intimate experience with the Kinect for game development afford it remarkable advantages over the alternatives. While predictions about the future of technology have been shown, over the past few years, to be a treacherous endeavor, the authors posit with some confidence that skills gained in developing with the Kinect SDK will not become obsolete in the near future.
Even more important, however, developing with the Kinect SDK is fun in a way that typical development is not. The pleasure of building your first skeleton tracking program is difficult to describe. It is in order to share this ineffable experience, familiar to anyone who still remembers their first software program and became a software developer in the belief that this sense of joy and accomplishment was repeatable, that we have written this book.
About This Book
This book is for the inveterate tinkerer who cannot resist playing with code samples before reading the instructions on why the samples are written the way they are. After all, you bought this book in order to find out how to play with the Kinect sensor and replicate some of the exciting scenarios you may have seen online. We understand if you do not want to initially wade through detailed explanations before seeing how far you can get with the samples on your own. At the same time, we have included in-depth information about why the Kinect SDK works the way it does and have provided guidance on the tricks and pitfalls of working with the SDK. You can always go back and read this information at a later point as it becomes important to you.
The chapters are provided in roughly sequential order, with each chapter building upon the chapters that went before. They begin with the basics, move on to image processing and skeleton tracking, then address more sophisticated scenarios involving complex gestures and speech recognition. Finally, they demonstrate how to combine the SDK with other code libraries in order to build complex effects. The appendix offers an overview of mathematical and kinematic concepts that you will want to become familiar with as you plan out your own unique Kinect applications.
Chapter Overview
Chapter 1: Getting Started
Your imagination is running wild with ideas and cool designs for applications. There are a few things to know first, however. This chapter will cover the surprisingly long history that led up to the creation of the Kinect for Windows SDK. It will then provide step-by-step instructions for downloading and installing the necessary libraries and tools needed to develop applications for the Kinect.
Chapter 2: Application Fundamentals

This chapter guides the reader through the process of building a Kinect application. At the completion of this chapter, the reader will have the foundation needed to write relatively sophisticated Kinect applications using the Microsoft SDK. This includes getting data from the Kinect to display a live image feed as well as a few tricks to manipulate the image stream. The basic code introduced here is common to virtually all Kinect applications.
Chapter 3: Depth Image Processing

The depth stream is at the core of Kinect technology. This code-intensive chapter explains the depth stream in detail: what data the Kinect sensor provides and what can be done with this data. Examples include creating images where users are identified and their silhouettes are colored, as well as simple tricks using the silhouettes to determine the distance of the user from the Kinect and from other users.
Chapter 4: Skeleton Tracking

By using the data from the depth stream, the Microsoft SDK can determine human shapes. This is called skeleton tracking. The reader will learn how to get skeleton tracking data, what that data means, and how to use it. At this point, you will know enough to have some fun. Walkthroughs include visually tracking skeleton joints and bones, and creating some basic games.
Chapter 5: Advanced Skeleton Tracking

There is more to skeleton tracking than just creating avatars and skeletons. Sometimes reading and processing raw Kinect data is not enough. It can be volatile and unpredictable. This chapter provides tips and tricks to smooth out this data to create more polished applications. In this chapter we will also move beyond the depth image and work with the live image. Using the data produced by the depth image and the visuals of the live image, we will build an augmented reality application.
Chapter 6: Gestures

The next level in Kinect development is processing skeleton tracking data to detect user gestures. Gestures make interacting with your application more natural. In fact, there is a whole field of study dedicated to natural user interfaces. This chapter will introduce NUI and show how it affects application development. Kinect is so new that well-established gesture libraries and tools are still lacking. This chapter will give guidance to help define what a gesture is and how to implement a basic gesture library.
Chapter 7: Speech

The Kinect is more than just a sensor that sees the world. It also hears it. The Kinect has an array of microphones that allows it to detect and process audio. This means that the user can use voice commands as well as gestures to interact with an application. In this chapter, you will be introduced to the Microsoft Speech Recognition SDK and shown how it is integrated with the Kinect microphone array.
Chapter 8: Beyond the Basics

This chapter introduces the reader to much more complex development that can be done with the Kinect. It addresses useful tools and ways to manipulate depth data to create complex applications and advanced Kinect visuals.
AppendixA:KinectMath
BasicmathskillsandformulasneededwhenworkingwithKinect.Givesonlypracticalinformation
neededfordevelopmenttasks.
What You Need to Use This Book
The Kinect SDK requires the Microsoft .NET Framework 4.0. To build applications with it, you will need either Visual Studio 2010 Express or another version of Visual Studio 2010. The Kinect SDK may be downloaded at http://www.kinectforwindows.org/download/.
The samples in this book are written with WPF 4 and C#. The Kinect SDK merely provides a way to read and manipulate the sensor streams from the Kinect device. Additional technology is required in order to display this data in interesting ways. For this book we have selected WPF, the preeminent vector graphics platform in the Microsoft stack, as well as a platform generally familiar to most developers working with Microsoft technologies. C#, in turn, is the .NET language with the greatest penetration among developers.
About the Code Samples
The code samples in this book have been written for version 1.0 of the Kinect for Windows SDK released on February 1st, 2012. You are invited to copy any of the code and use it as you will, but the authors hope you will actually improve upon it. Book code, after all, is not real code. Each project and snippet found in this book has been selected for its ability to illustrate a point rather than its efficiency in performing a task. Where possible we have attempted to provide best practices for writing performant Kinect code, but whenever good code collided with legible code, legibility tended to win.
More painful to us, given that both the authors work for a design agency, was the realization that the book you hold in your hands needed to be about Kinect code rather than about Kinect design. To this end, we have reined in our impulse to build elaborate presentation layers in favor of spare, workmanlike designs.
The source code for the projects described in this book is available for download at http://www.apress.com/9781430241041. This is the official home page of the book. You can also check for errata and find related Apress titles here.
CHAPTER 1
Getting Started
In this chapter, we explain what makes Kinect special and how Microsoft got to the point of providing a Kinect for Windows SDK, something that Microsoft apparently did not envision when it released what was thought of as a new kind of "controller-free" controller for the Xbox. We take you through the steps involved in installing the Kinect for Windows SDK, plugging in your Kinect sensor, and verifying that everything is working the way it should in order to start programming for Kinect. We then navigate through the samples provided with the SDK and describe their significance in demonstrating how to program for the Kinect.
The Kinect Creation Story
The history of Kinect begins long before the device itself was conceived. Kinect has roots in decades of thinking and dreaming about user interfaces based upon gesture and voice. The hit 2002 movie The Minority Report added fuel to the fire with its futuristic depiction of a spatial user interface. Rivalry between competing gaming consoles brought the Kinect technology into our living rooms. It was the hacker ethic of unlocking anything intended to be sealed, however, that eventually opened up the Kinect to developers.
Pre-History
Bill Buxton has been talking over the past few years about something he calls the Long Nose of Innovation. A play on Chris Anderson's notion of the Long Tail, the Long Nose describes the decades of incubation time required to produce a "revolutionary" new technology apparently out of nowhere. The classic example is the invention and refinement of a device central to the GUI revolution: the mouse.

The first mouse prototype was built by Douglas Engelbart and Bill English, then at the Stanford Research Institute, in 1963. They even gave the device its murine name. Bill English developed the concept further when he took it to Xerox PARC in 1973. With Jack Hawley, he added the famous mouse ball to the design of the mouse. During this same time period, Telefunken in Germany was independently developing its own rollerball mouse device called the Telefunken Rollkugel. By 1982, the first commercial mouse began to find its way to the market; Logitech began selling one for $299. It was somewhere in this period that Steve Jobs visited Xerox PARC and saw the mouse working with a WIMP interface (windows, icons, menus, pointers). Sometime after that, Jobs invited Bill Gates to see the mouse-based GUI interface he was working on. Apple released the Lisa in 1983 with a mouse, and then equipped the Macintosh with the mouse in 1984. Microsoft announced its Windows OS shortly after the release of the Lisa and began selling Windows 1.0 in 1985. It was not until 1995, with the release of Microsoft's Windows 95 operating system, that the mouse became ubiquitous. The Long Nose describes the 30-year span required for devices like the mouse to go from invention to ubiquity.
A similar 30-year Long Nose can be sketched out for Kinect. Starting in the late 1970s, about halfway into the mouse's development trajectory, Chris Schmandt at the MIT Architecture Machine Group started a research project called Put-That-There, based on an idea by Richard Bolt, which combined voice and gesture recognition as input vectors for a graphical interface. The Put-That-There installation lived in a sixteen-foot by eleven-foot room with a large projection screen against one wall. The user sat in a vinyl chair about eight feet in front of the screen and had a magnetic cube hidden upon one wrist for spatial input as well as a head-mounted microphone. With these inputs, and some rudimentary speech parsing logic around pronouns like "that" and "there," the user could create and move basic shapes around the screen. Bolt suggests in his 1980 paper describing the project, "Put-That-There: Voice and Gesture at the Graphics Interface," that eventually the head-mounted microphone should be replaced with a directional mic. Subsequent versions of Put-That-There allowed users to guide ships through the Caribbean and place colonial buildings on a map of Boston.
Another MIT Media Lab research project from 1993, by David Koons, Kristinn Thorisson, and Carlton Sparrell, and again directed by Bolt, called The Iconic System, refined the Put-That-There concept to work with speech and gesture as well as a third input modality: eye tracking. Also, instead of projecting input onto a two-dimensional space, the graphical interface was a computer-generated three-dimensional space. In place of the magnetic cubes used for Put-That-There, the Iconic System included special gloves to facilitate gesture tracking.
Towards the late 1990s, Mark Lucente developed an advanced user interface for IBM Research called DreamSpace, which ran on a variety of platforms including Windows NT. It even implemented the Put-That-There syntax of Chris Schmandt's 1979 project. Unlike any of its predecessors, however, DreamSpace did not use wands or gloves for gesture recognition. Instead, it used a vision system. Moreover, Lucente envisioned DreamSpace not only for specialized scenarios but also as a viable alternative to standard mouse and keyboard inputs for everyday computing. Lucente helped to popularize speech and gesture recognition by demonstrating DreamSpace at trade shows between 1997 and 1999.
In 1999 John Underkoffler, also with MIT Media Lab and a coauthor with Mark Lucente on a paper a few years earlier on holography, was invited to work on a new Steven Spielberg project called The Minority Report. Underkoffler eventually became the Science and Technology Advisor on the film and, with Alex McDowell, the film's production designer, put together the user interface Tom Cruise uses in the movie. Some of the design concepts from The Minority Report UI eventually ended up in another project Underkoffler worked on called G-Speak.
Perhaps Underkoffler's most fascinating design contribution to the film was a suggestion he made to Spielberg to have Cruise accidentally put his virtual desktop into disarray when he turns and reaches out to shake Colin Farrell's hand. It is a scene that captures the jarring acknowledgment that even "smart" computer interfaces are ultimately still reliant on conventions and that these conventions are easily undermined by the uncanny facticity of real life.
The Minority Report was released in 2002. The film's visuals immediately seeped into the collective unconscious, hanging in the zeitgeist like a promissory note. A mild discontent over the prevalence of the mouse in our daily lives began to be felt, and the press as well as popular attention began to turn toward what we came to call the Natural User Interface (NUI). Microsoft began working on its innovative multitouch platform Surface in 2003, began showing it in 2007, and eventually released it in 2008. Apple unveiled the iPhone in 2007. The iPad began selling in 2010. As each NUI technology came to market, it was accompanied by comparisons to The Minority Report.
The Minority Report
So much ink has been spilled about the obvious influence of The Minority Report on the development of Kinect that at one point I insisted to my co-author that we should try to avoid ever using the words "minority" and "report" together on the same page. In this endeavor I have failed miserably and concede that avoiding mention of The Minority Report when discussing Kinect is virtually impossible.
One of the more peculiar responses to the movie was the film critic Roger Ebert's opinion that it offered an "optimistic preview" of the future. The Minority Report, based loosely on a short story by Philip K. Dick, depicts a future in which police surveillance is pervasive to the point of predicting crimes before they happen and incarcerating those who have not yet committed the crimes. It includes massively pervasive marketing in which retinal scans are used in public places to target advertisements to pedestrians based on demographic data collected on them and stored in the cloud. Genetic experimentation results in monstrously carnivorous plants, robot spiders that roam the streets, a thriving black market in body parts that allows people to change their identities and, perhaps the most jarring future prediction of all, policemen wearing rocket packs.
Perhaps what Ebert responded to was the notion that the world of The Minority Report was a believable future, extrapolated from our world, demonstrating that through technology our world can actually change and not merely be more of the same. Even if it introduces new problems, science fiction reinforces the idea that technology can help us leave our current problems behind. In the 1958 book The Human Condition, the author and philosopher Hannah Arendt characterizes the role of science fiction in society by saying, "…science has realized and affirmed what men anticipated in dreams that were neither wild nor idle…buried in the highly non-respectable literature of science fiction (to which, unfortunately, nobody yet has paid the attention it deserves as a vehicle of mass sentiments and mass desires)." While we may not all be craving rocket packs, we do all at least have the aspiration that technology will significantly change our lives.
What is peculiar about The Minority Report and, before that, science fiction series like the Star Trek franchise, is that they do not always merely predict the future but can even shape that future. When I first walked through automatic sliding doors at a local convenience store, I knew this was based on the sliding doors on the USS Enterprise. When I held my first flip phone in my hands, I knew it was based on Captain Kirk's communicator and, moreover, would never have been designed this way had Star Trek never aired on television.
If The Minority Report drove the design and adoption of the gesture recognition system on Kinect, Star Trek can be said to have driven the speech recognition capabilities of Kinect. In interviews with Microsoft employees and executives, there are repeated references to the desire to make Kinect work like the Star Trek computer or the Star Trek holodeck. There is a sense in those interviews that if the speech recognition portion of the device was not solved (and occasionally there were discussions about dropping the feature as it fell behind schedule), the Kinect sensor would not have been the future device everyone wanted.
Microsoft’sSecretProject
In the gaming world, Nintendo threw down the gauntlet at the 2005 Tokyo Game Show conference with the unveiling of the Wii console. The console was accompanied by a new gaming device called the Wii Remote. Like the magnetic cubes from the original Put-That-There project, the Wii Remote can detect movement along three axes. Additionally, the remote contains an optical sensor that detects where it is pointing. It is also battery powered, eliminating the long cords to the console common to other platforms.
Following the release of the Wii in 2006, Peter Moore, then head of Microsoft's Xbox division, demanded that work start on a competitive Wii killer. It was also around this time that Alex Kipman, head of an incubation team inside the Xbox division, met the founders of PrimeSense at the 2006 Electronic Entertainment Expo. Microsoft created two competing teams to come up with the intended Wii killer: one working with the PrimeSense technology and the other working with technology developed by a company called 3DV. Though the original goal was to unveil something at E3 2007, neither team seemed to have anything sufficiently polished in time for the exposition. Things were thrown a bit more off track in 2007 when Peter Moore announced that he was leaving Microsoft to go work for Electronic Arts.
It is clear that by the summer of 2007 the secret work being done inside the Xbox team was gaining momentum internally at Microsoft. At the D: All Things Digital conference that year, Bill Gates was interviewed side-by-side with Steve Jobs. During that interview, in response to a question about Microsoft Surface and whether multitouch would become mainstream, Gates began talking about vision recognition as the step beyond multitouch:

Gates: Software is doing vision. And so, imagine a game machine where you just can pick up the bat and swing it or pick up the tennis racket and swing it.

Interviewer: We have one of those. That's Wii.

Gates: No. No. That's not it. You can't pick up your tennis racket and swing it. You can't sit there with your friends and do those natural things. That's a 3-D positional device. This is video recognition. This is a camera seeing what's going on. In a meeting, when you are on a conference, you don't know who's speaking when it's audio only…the camera will be ubiquitous…software can do vision, it can do it very, very inexpensively…and that means this stuff becomes pervasive. You don't just talk about it being in a laptop device. You talk about it being a part of the meeting room or the living room…

Amazingly, the interviewer, Walt Mossberg, cut Gates off during his fugue about the future of technology and turned the conversation back to what was most important in 2007: laptops! Nevertheless, Gates revealed in this interview that Microsoft was already thinking of the new technology being developed in the Xbox team as something more than merely a gaming device. It was already thought of as a device for the office as well.
FollowingMoore’sdeparture,DonMatricktookupthereigns,guidingtheXboxteam.In2008,he
revivedthesecretvideorecognitionprojectaroundthePrimeSensetechnology.While3DV’stechnology
apparentlynevermadeitintothefinalKinect,Microsoftboughtthecompanyin2009for$35million.
ThiswasapparentlydoneinordertodefendagainstpotentialpatentdisputesaroundKinect.Alex
Kipman,amanagerwithMicrosoftsince2001,wasmadeGeneralManagerofIncubationandputin
chargeofcreatingthenewProjectNataldevicetoincludedepthrecognition,motiontracking,facial
recognition,andspeechrecognition.
NoteWhat’sinaname?Microsofthastraditionally,ifnotconsistently,givencitynamestolargeprojectsas
theircodenames.AlexKipmandubbedthesecretXboxprojectNatal,afterhishometowninBrazil.
The reference device created by PrimeSense included an RGB camera, an infrared sensor, and an infrared light source. Microsoft licensed PrimeSense's reference design and PS1080 chip design, which processed depth data at 30 frames per second. Importantly, it processed depth data in an innovative way that drastically cut the price of depth recognition compared to the prevailing method at the time, called "time of flight," a technique that tracks the time it takes for a beam of light to leave and then return to the sensor. The PrimeSense solution was to project a pattern of infrared dots across the room and use the size and spacing between dots to form a 320 x 240 pixel depth map analyzed by the PS1080 chip. The chip also automatically aligned the information from the RGB camera and the infrared camera, providing RGBD data to higher systems.
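The structured-light approach described above ultimately rests on simple triangulation: the infrared projector and camera sit a known baseline apart, so the apparent shift (disparity) of each projected dot against a stored reference pattern encodes its distance. The sketch below illustrates that principle only; the focal length and baseline values are illustrative assumptions, not the PS1080's actual calibration.

```python
# Illustrative sketch of structured-light triangulation. A dot that shifts
# more against the reference pattern is closer to the sensor; a dot that
# shifts less is farther away. The constants are made up for illustration.

def depth_from_disparity(disparity_px, focal_len_px=580.0, baseline_m=0.075):
    """Convert a dot's apparent shift (in pixels) to a depth estimate (in meters)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return (focal_len_px * baseline_m) / disparity_px

near = depth_from_disparity(60.0)  # large shift: the dot landed on something close
far = depth_from_disparity(15.0)   # small shift: the dot landed on something distant
assert near < far
```

The same relation explains why such sensors lose precision at range: depth grows as the inverse of disparity, so at large distances a one-pixel measurement error covers a much larger slice of depth.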
Microsoft added a four-piece microphone array to this basic structure, effectively providing a directional microphone for speech recognition that would be effective in a large room. Microsoft already had years of experience with speech recognition, which has been available on its operating systems since Windows XP.
Kudo Tsunoda, recently hired away from Electronic Arts, was also brought onto the project, leading his own incubation team, to create prototype games for the new device. He and Kipman had a deadline of August 18, 2008, to show a group of Microsoft executives what Project Natal could do. Tsunoda's team came up with 70 prototypes, some of which were shown to the execs. The project got the green light and the real work began. They were given a launch date for Project Natal: Christmas of 2010.
Microsoft Research
While the hardware problem was mostly solved thanks to PrimeSense (all that remained was to give the device a smaller form factor), the software challenges seemed insurmountable. First, a responsive motion recognition system had to be created based on the RGB and depth data streams coming from the device. Next, serious scrubbing had to be performed in order to make the audio feed workable with the underlying speech platform. The Project Natal team turned to Microsoft Research (MSR) to help solve these problems.
MSR is a multibillion-dollar annual investment by Microsoft. The various MSR locations are typically dedicated to pure research in computer science and engineering rather than to trying to come up with new products for their parent. It must have seemed strange, then, when the Xbox team approached various branches of Microsoft Research to not only help them come up with a product but to do so according to the rhythms of a very short product cycle.
In late 2008, the Project Natal team contacted Jamie Shotton at the MSR office in Cambridge, England, to help with their motion-tracking problem. The motion tracking solution Kipman's team came up with had several problems. First, it relied on the player getting into an initial T-shaped pose to allow the motion capture software to discover him. Next, it would occasionally lose the player during motion, obligating the player to reinitialize the system by once again assuming the T position. Finally, the motion tracking software would only work with the particular body type it was designed for: that of Microsoft executives.
On the other hand, the depth data provided by the sensor already solved several major problems for motion tracking. The depth data allows easy filtering of any pixels that are not the player. Extraneous information such as the color and texture of the player's clothes is also filtered out by the depth camera data. What is left is basically a player blob represented in pixel positions, as shown in Figure 1-1. The depth camera data, additionally, provides information about the height and width of the player in meters.
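The player-filtering idea described above survives into the Kinect SDK itself: when skeleton tracking is enabled, each 16-bit depth pixel carries a 3-bit player index in its low bits and the depth in millimeters in the remaining upper bits. The following is a minimal sketch of unpacking that layout, written in Python for brevity (the book's actual samples are C#):

```python
# Sketch of extracting a "player blob" from Kinect-style depth pixels.
# Layout assumed here: low 3 bits = player index (0 means no player),
# upper 13 bits = depth in millimeters.

PLAYER_INDEX_BITMASK = 0x07  # low 3 bits
DEPTH_SHIFT = 3              # depth occupies the remaining bits

def player_blob(depth_pixels):
    """Return (mask, depths): which pixels belong to any player, and depth in mm."""
    mask = [(p & PLAYER_INDEX_BITMASK) != 0 for p in depth_pixels]
    depths = [p >> DEPTH_SHIFT for p in depth_pixels]
    return mask, depths

# Two fake pixels: a background pixel at 2000 mm and a player-1 pixel at 1500 mm.
pixels = [(2000 << 3) | 0, (1500 << 3) | 1]
mask, depths = player_blob(pixels)
assert mask == [False, True]
assert depths == [2000, 1500]
```

Masking on the player index is exactly the "easy filtering" the text describes: everything that is not a player drops out before any body-part reasoning begins.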
Figure 1-1. The player blob
The challenge for Shotton was to turn this outline of a person into something that could be tracked. The problem, as he saw it, was to break up the player blob provided by the depth stream into recognizable body parts. From these body parts, joints can be identified, and from these joints, a skeleton can be reconstructed. Working with Andrew Fitzgibbon and Andrew Blake, Shotton arrived at an algorithm that could distinguish 31 body parts (see Figure 1-2). Out of these parts, the version of Kinect demonstrated at E3 in 2009 could produce 48 joints (the Kinect SDK, by contrast, exposes 20 joints).
Figure 1-2. Player parts
To get around the initial T-pose required of the player for calibration, Shotton decided to appeal to the power of machine learning. With lots and lots of data, the image recognition software could be trained to break up the player blob into usable body parts. Teams were sent out to videotape people in their homes performing basic physical motions. Additional data was collected in a Hollywood motion capture studio of people dancing, running, and performing acrobatics. All of this video was then passed through a distributed computation engine called Dryad, which had been developed by another branch of Microsoft Research in Mountain View, California, in order to begin generating a decision tree classifier that could map any given pixel of Kinect's RGBD stream onto one of the 31 body parts. This was done for 12 different body types and repeatedly tweaked to improve the decision software's ability to identify a person without an initial pose, without breaks in recognition, and for different kinds of people.
This took care of The Minority Report aspect of Kinect. To handle the Star Trek portion, Alex Kipman turned to Ivan Tashev of the Microsoft Research group based in Redmond. Tashev and his team had worked on the microphone array implementation in Windows Vista. Just as being able to filter out non-player pixels is a large part of the skeletal recognition solution, filtering out background noise on a microphone array situated much closer to a stereo system than it is to the speaker was the biggest part of making speech recognition work on Kinect. Using a combination of patented technologies (provided to us for free in the Kinect for Windows SDK), Tashev's team came up with innovative noise suppression and echo cancellation tricks that improved the audio processing pipeline many times over the standard that was available at the time.
Based on this audio scrubbing, a distributed computer learning program of a thousand computers spent a week building an acoustical model for Kinect based on various American regional accents and the peculiar acoustic properties of the Kinect microphone array. This model became the basis of the TellMe feature included with the Xbox as well as the Kinect for Windows Runtime Language Pack used with the Kinect for Windows SDK. Cutting things very close, the acoustical model was not completed until September 26, 2010. Shortly after, on November 4, the Kinect sensor was released.
The Race to Hack Kinect
The release of the Kinect sensor was met with mixed reviews. Gaming sites generally acknowledged that the technology was cool but felt that players would quickly grow tired of the gameplay. This did not slow down Kinect sales, however. The device sold an average of 133 thousand units a day for the first 60 days after the launch, breaking the sales records of both the iPhone and the iPad and setting a new Guinness world record. It wasn't that the gaming review sites were wrong about the novelty factor of Kinect; it was just that people wanted Kinect anyway, whether they played with it every day or only for a few hours. It was a piece of the future they could have in their living rooms.
The excitement in the consumer market was matched by the excitement in the computer hacking community. The hacking story starts with Johnny Chung Lee, the man who originally hacked a Wii Remote to implement finger tracking and was later hired onto the Project Natal team to work on gesture recognition. Frustrated by the failure of internal efforts at Microsoft to publish a public driver, Lee approached Adafruit, a vendor of open-source electronic kits, to host a contest to hack Kinect. The contest, announced on the day of the Kinect launch, was built around an interesting hardware feature of the Kinect sensor: it uses a standard USB connector to talk to the Xbox. This same USB connector can be plugged into the USB port of any PC or laptop. The first person to successfully create a driver for the device and write an application converting the data streams from the sensor into video and depth displays would win the $1,000 bounty that Lee had put up for the contest.
On the same day, Microsoft made the following statement in response to the AdaFruit contest: "Microsoft does not condone the modification of its products… With Kinect, Microsoft built in numerous hardware and software safeguards designed to reduce the chances of product tampering. Microsoft will continue to make advances in these types of safeguards and work closely with law enforcement and product safety groups to keep Kinect tamper-resistant." Lee and AdaFruit responded by raising the bounty to $2,000.
By November 6, Joshua Blake, Seth Sandler, Kyle Machulis, and others had created the OpenKinect mailing list to help coordinate efforts around the contest. Their notion was that the driver problem was solvable but that the longevity of the Kinect hacking effort for the PC would involve sharing information and building tools around the technology. They were already looking beyond the AdaFruit contest and imagining what would come after. In a November 7 post to the list, they even proposed sharing the bounty with the OpenKinect community, if someone on the list won the contest, in order to look past the money and toward what could be done with the Kinect technology. Their mailing list would go on to be the home of the Kinect hacking community for the next year.
Simultaneously, on November 6, a hacker known as AlexP was able to control Kinect's motors and read its accelerometer data. The AdaFruit bounty was raised to $3,000. On Monday, November 8, AlexP posted video showing that he could pull both RGB and depth data streams from the Kinect sensor and display them. He could not collect the prize, however, because of concerns about open sourcing his code. On the 8th, Microsoft also clarified its previous position in a way that appeared to allow the ongoing efforts to hack Kinect as long as it wasn't called "hacking":
Kinect for Xbox 360 has not been hacked—in any way—as the software and hardware that are part of Kinect for Xbox 360 have not been modified. What has happened is someone has created drivers that allow other devices to interface with the Kinect for Xbox 360. The creation of these drivers, and the use of Kinect for Xbox 360 with other devices, is unsupported. We strongly encourage customers to use Kinect for Xbox 360 with their Xbox 360 to get the best experience possible.
On November 9, AdaFruit finally received a USB analyzer, the Beagle 480, in the mail and set to work publishing USB data dumps coming from the Kinect sensor. The OpenKinect community, calling themselves "Team Tiger," began working on this data over an IRC channel and had made significant progress by Wednesday morning before going to sleep. At the same time, however, Hector Martin, a computer science major in Bilbao, Spain, had just purchased Kinect and had begun going through the AdaFruit data. Within a few hours he had written the driver and application to display RGB and depth video. The AdaFruit prize had been claimed in only seven days.
Martin became a contributor to the OpenKinect group and a new library, libfreenect, became the basis of the community's hacking efforts. Joshua Blake announced Martin's contribution to the OpenKinect mailing list in the following post:
I got ahold of Hector on IRC just after he posted the video and talked to him about this group. He said he'd be happy to join us (and in fact has already subscribed). After he sleeps to recover, we'll talk some more about integrating his work and our work.
This is when the real fun started. Throughout November, people started to post videos on the Internet showing what they could do with Kinect. Kinect-based artistic displays, augmented reality experiences, and robotics experiments started showing up on YouTube. Sites like KinectHacks.net sprang up to track all the things people were building with Kinect. By November 20, someone had posted a video of a lightsaber simulator using Kinect—another movie aspiration checked off. Microsoft, meanwhile, was not idle. The company watched with excitement as hundreds of Kinect hacks made their way to the web.
On December 10, PrimeSense announced the release of its own open source drivers for Kinect along with libraries for working with the data. This provided improvements to the skeleton tracking algorithms over what was then possible with libfreenect, and projects that required integration of RGB and depth data began migrating over to the OpenNI technology stack that PrimeSense had made available. Without the key Microsoft Research technologies, however, skeleton tracking with OpenNI still required the awkward T-pose to initialize skeleton recognition.
On June 17, 2011, Microsoft finally released the Kinect SDK beta to the public under a noncommercial license after demonstrating it for several weeks at events like MIX. As promised, it included the skeleton recognition algorithms that make an initial pose unnecessary as well as the AEC technology and acoustic models required to make the Kinect speech recognition system work in a large room. Every developer now had access to the same tools Microsoft used internally for developing Kinect applications for the computer.
The Kinect for Windows SDK
The Kinect for Windows SDK is the set of libraries that allows us to program applications on a variety of Microsoft development platforms using the Kinect sensor as input. With it, we can program WPF applications, WinForms applications, XNA applications and, with a little work, even browser-based applications running on the Windows operating system—though, oddly enough, we cannot create Xbox games with the Kinect for Windows SDK. Developers can use the SDK with the Xbox Kinect sensor. In order to use Kinect's near mode capabilities, however, we require the official Kinect for Windows hardware. Additionally, the Kinect for Windows sensor is required for commercial deployments.
Understanding the Hardware
The Kinect for Windows SDK takes advantage of and is dependent upon the specialized components included in all planned versions of the Kinect device. In order to understand the capabilities of the SDK, it is important to first understand the hardware it talks to. The glossy black case for the Kinect components includes a head as well as a base, as shown in Figure 1-3. The head is 12 inches by 2.5 inches by 1.5 inches. The attachment between the base and the head is motorized. The case hides an infrared projector, two cameras, four microphones, and a fan.
Figure 1-3. The Kinect case
I do not recommend ever removing the Kinect case. In order to show the internal components, however, I have removed the case, as shown in Figure 1-4. On the front of Kinect, from left to right respectively when facing Kinect, you will find the sensors and light source that are used to capture RGB and depth data. To the far left is the infrared light source. Next to this is the LED ready indicator. Next is the color camera used to collect RGB data, and finally, on the right (toward the center of the Kinect head), is the infrared camera used to capture depth data. The color camera supports a maximum resolution of 1280 x 960 while the depth camera supports a maximum resolution of 640 x 480.
Figure 1-4. The Kinect components
On the underside of Kinect is the microphone array. The microphone array is composed of four different microphones. One is located to the left of the infrared light source. The other three are evenly spaced to the right of the depth camera.
If you bought a Kinect sensor without an Xbox bundle, the Kinect comes with a Y-cable, which extends the USB connector wire on Kinect as well as providing additional power to Kinect. The USB extender is required because the male connector that comes off of Kinect is not a standard USB connector. The additional power is required to run the motors on the Kinect.
If you buy a new Xbox bundled with Kinect, you will likely not have a Y-cable included with your purchase. This is because the newer Xbox consoles have a proprietary female USB connector that works with Kinect as is and does not require additional power for the Kinect servos. This is a problem—and a source of enormous confusion—if you intend to use Kinect for PC development with the Kinect SDK. You will need to purchase the Y-cable separately if you did not get it with your Kinect. It is typically marketed as a Kinect AC Adapter or Kinect Power Source. Software built using the Kinect SDK will not work without it.
A final piece of interesting Kinect hardware, sold by Nyko rather than by Microsoft, is called the Kinect Zoom. The base Kinect hardware performs depth recognition between 0.8 and 4 meters. The Kinect Zoom is a set of lenses that fit over Kinect, allowing the Kinect sensor to be used in rooms smaller than the standard dimensions Microsoft recommends. It is particularly appealing for users of the Kinect SDK who might want to use it for specialized functionality such as custom finger tracking logic or productivity tool implementations involving a person sitting down in front of Kinect. From experimentation, it actually turns out to not be very good for playing games, perhaps due to the quality of the lenses.
Kinect for Windows SDK Hardware and Software Requirements
Unlike other Kinect libraries, the Kinect for Windows SDK, as its name suggests, only runs on Windows operating systems. Specifically, it runs on x86 and x64 versions of Windows 7. It has been shown to also work on early versions of Windows 8. Because Kinect was designed for Xbox hardware, it requires roughly similar hardware on a PC to run effectively.
Hardware Requirements
• Computer with a dual-core, 2.66-GHz or faster processor
• Windows 7–compatible graphics card that supports Microsoft DirectX 9.0c capabilities
• 2 GB of RAM (4 GB of RAM recommended)
• Kinect for Xbox 360 sensor
• Kinect USB power adapter
Use the free Visual Studio 2010 Express or other VS2010 editions to program against the Kinect for Windows SDK. You will also need to have the DirectX 9.0c runtime installed. Later versions of DirectX are not backwards compatible. You will also, of course, want to download and install the latest version of the Kinect for Windows SDK. The Kinect SDK installer will install the Kinect drivers, the Microsoft Research Kinect assembly, as well as code samples.
Software Requirements
• Microsoft Visual Studio 2010 Express or other Visual Studio 2010 edition: http://www.microsoft.com/visualstudio/en-us/products/2010-editions/express
• Microsoft .NET Framework 4
• The Kinect for Windows SDK (x86 or x64): http://www.kinectforwindows.com
• For C++ SkeletalViewer samples:
  • DirectX Software Development Kit, June 2010 or later version: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=6812
  • DirectX End-User Runtime Web Installer: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=35
To take full advantage of the audio capabilities of Kinect, you will also need additional Microsoft speech recognition software: the Speech Platform API, the Speech Platform SDK, and the Kinect for Windows Runtime Language Pack. Fortunately, the install for the SDK automatically installs these additional components for you. Should you ever accidentally uninstall these speech components, however, it is important to be aware that the other Kinect features, such as depth processing and skeleton tracking, are fully functional even without the speech components.
Step-By-Step Installation
Before installing the Kinect for Windows SDK:
1. Verify that your Kinect device is not plugged into the computer you are installing to.
2. Verify that Visual Studio is closed during the installation process.
If you have other Kinect drivers on your computer, such as those provided by PrimeSense, you should consider removing these. They will not run side-by-side with the SDK, and the Kinect drivers provided by Microsoft will not interoperate with other Kinect libraries such as OpenNI or libfreenect. It is possible to install and uninstall the SDK on top of other Kinect platforms and switch back and forth by repeatedly uninstalling and reinstalling the SDK. However, this has also been known to cause inconsistencies, as the wrong driver can occasionally be loaded when performing this procedure. If you plan to go back and forth between different Kinect stacks, installing on separate machines is the safest path.
To uninstall other drivers, including previous versions of those provided with the SDK, go to Programs and Features in the Control Panel, select the name of the driver you wish to remove, and click Uninstall.
Download the appropriate installation msi (x86 or x64) for your computer. If you are uncertain whether your version of Windows is 32-bit or 64-bit, you can right-click on the Windows icon on your desktop and go to Properties in order to find out. You can also access your system information by going to the Control Panel and selecting System. Your operating system architecture will be listed next to the title System type. If your OS is 64-bit, you should install the x64 version. Otherwise, install the x86 version of the msi.
Run the installer once it is successfully downloaded to your machine. Follow the Setup wizard prompts until installation of the SDK is complete. Make sure that Kinect's extra power supply is also plugged into a power source. You can now plug your Kinect device into a USB port on your computer. On first connecting the Kinect to your PC, Windows will recognize the device and begin loading the Kinect drivers. You may see a message on your Windows taskbar indicating that this is occurring. When the drivers have finished loading, the LED light on your Kinect will turn a solid green.
You may want to verify that the drivers installed successfully. This is typically a troubleshooting procedure in case you encounter any problems as you run the SDK samples or begin working through the code in this book. In order to verify that the drivers are installed correctly, open the Control Panel and select Device Manager. As Figure 1-5 shows, the Microsoft Kinect node in Device Manager should list three items if the drivers were correctly installed: the Microsoft Kinect Audio Array Control, Microsoft Kinect Camera, and Microsoft Kinect Security Control.
Figure 1-5. Kinect drivers
You will also want to verify that Kinect's microphone array was correctly recognized during installation. To do so, go to the Control Panel and then the Device Manager again. As Figure 1-6 shows, the listing for Kinect USB Audio should be present under the Sound, video and game controllers node.
Figure 1-6. Microphone array
If you find that any of the four devices mentioned above do not appear in Device Manager, you should uninstall the SDK and attempt to install it again. The most common problems seem to occur around having the Kinect device accidentally plugged into the PC during install or forgetting to plug in the Kinect adapter when connecting the Kinect to the PC for the first time. You may also find that other USB devices, such as a webcam, stop working once Kinect starts working. This occurs because Kinect may conflict with other USB devices connected to the same host controller. You can work around this by trying other USB ports. A PC or laptop typically has one host controller for the ports on the front or side of the computer and another host controller at the back. Also use different USB host controllers if you attempt to daisy chain multiple Kinect devices for the same application.
To work with speech recognition, install the Microsoft Speech Platform Server Runtime (x86), the Speech Platform SDK (x86), and the Kinect for Windows Language Pack. These installs should occur in the order listed. While the first two components are not specific to Kinect and can be used for general speech recognition development, the Kinect language pack contains the acoustic models specific to the Kinect. For Kinect development, the Kinect language pack cannot be replaced with another language pack, and the Kinect language pack will not be useful to you when developing speech recognition applications without Kinect.
Elements of a Kinect Visual Studio Project
If you are already familiar with the development experience using Visual Studio, then the basic steps for implementing a Kinect application should seem fairly straightforward. You simply have to:
1. Create a new project.
2. Reference the Microsoft.Kinect.dll.
3. Declare the appropriate Kinect namespace.
The main hurdle in programming for Kinect is getting used to the idea that windows, the main UI container of .NET programs, are not used for input as they are in typical applications. Instead, windows are used to display information only while all input is derived from the Kinect sensor. A second hurdle is getting used to the notion that input from Kinect is continuous and constantly changing. A Kinect program does not wait for a discrete event such as a button press. Instead, it repeatedly processes information from the RGB, depth, and skeleton streams and rearranges the UI container appropriately.
The Kinect SDK supports three kinds of managed applications (applications that use C# or Visual Basic rather than C++): Console applications, WPF applications, and Windows Forms applications. Console applications are actually the easiest to get started with, as they do not create the expectation that we must interact with UI elements like buttons, dropdowns, or checkboxes.
To create a new Kinect application, open Visual Studio and select File ➤ New ➤ Project. A dialog window will appear offering you a choice of project templates. Under Visual C# ➤ Windows, select Console Application and either accept the default name for the project or create your own project name.
You will now want to add a reference to the Kinect assembly you installed in the steps above. In the Visual Studio Solution pane, right-click on the References folder, as shown in Figure 1-7. Select Add Reference. A new dialog window will appear listing various assemblies you can add to your project. Find the Microsoft.Kinect assembly and add it to your project.
Figure 1-7. Add a reference to the Kinect library
At the top of the Program.cs file for your application, add the namespace declaration for the Microsoft.Kinect namespace. This namespace encapsulates all of the Kinect functionality for both NUI and audio.
using Microsoft.Kinect;
Three additional steps are standard for Kinect applications that take advantage of the data from the cameras. The KinectSensor object must be instantiated, initialized, and then started. To build an extremely trivial application to display the bit stream flowing from the depth camera, we will instantiate a new KinectSensor object according to the example in Listing 1-1. In this case, we assume there is only one camera in the KinectSensors array. We initialize the sensor by enabling the data streams we wish to use. Enabling data streams we do not intend to use would cause unnecessary performance overhead. Next we add an event handler for the DepthFrameReady event, and then create a loop that waits until the spacebar is pressed before ending the application. As a final step, just before the application exits, we follow good practice and stop the sensor.
Listing 1-1. Instantiate and Initialize the Runtime
static void Main(string[] args)
{
    // instantiate the sensor instance
    KinectSensor sensor = KinectSensor.KinectSensors[0];

    // initialize the cameras
    sensor.DepthStream.Enable();
    sensor.DepthFrameReady += sensor_DepthFrameReady;

    // make it look like The Matrix
    Console.ForegroundColor = ConsoleColor.Green;

    // start the data streaming
    sensor.Start();
    while (Console.ReadKey().Key != ConsoleKey.Spacebar) { }

    // stop the sensor before exiting
    sensor.Stop();
}
The heart of any Kinect app is not the code above, which is primarily boilerplate, but rather what we choose to do with the data passed by the DepthFrameReady event. All of the cool Kinect applications you have seen on the Internet use the data from the DepthFrameReady, ColorFrameReady, and SkeletonFrameReady events to accomplish the remarkable effects that have brought you to this book. In Listing 1-2, we will finish off the application by simply writing the image bits from the depth camera to the console window to see something similar to what the early Kinect hackers saw and got excited about back in November of 2010.
Listing 1-2. First Peek at the Kinect Depth Stream Data
static void sensor_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using (var depthFrame = e.OpenDepthImageFrame())
    {
        if (depthFrame == null)
            return;

        short[] bits = new short[depthFrame.PixelDataLength];
        depthFrame.CopyPixelDataTo(bits);
        foreach (var bit in bits)
            Console.Write(bit);
    }
}
As you wave your arms in front of the Kinect sensor, you will experience the first oddity of developing with Kinect. You will repeatedly have to push your chair away from the Kinect sensor as you test your applications. If you do this in an open space with co-workers, you will receive strange looks. I highly recommend programming for Kinect in a private, secluded space to avoid these strange looks. In my experience, people generally view a software developer wildly swinging his arms with concern and, more often, suspicion.
The Kinect SDK Sample Applications
The Kinect for Windows SDK installs several reference applications and samples. These applications provide a starting point for working with the SDK. They are written in a combination of C# and C++ and serve the sometimes contrary objectives of showing in a clear way how to use the Kinect SDK and presenting best practices for programming with the SDK. While this book does not delve into the details of programming in C++, it is still useful to examine these examples, if only to remind ourselves that the Kinect SDK is based on a C++ library that was originally written for game developers working in C++. The C# classes are often merely wrappers for these underlying libraries and, at times, expose leaky abstractions that make sense only when we consider their C++ underpinnings.
A word should be said about the difference between sample applications and reference applications. The code for this book is sample code. It demonstrates in the easiest way possible how to perform given tasks related to the data received from the Kinect sensor. It should rarely be used as is in your own applications. The code in reference applications, on the other hand, has the additional burden of showing the best way to organize code to make it robust and to embody good architectural principles. One of the greatest myths in the software industry is perhaps the implicit belief that good architecture is also readable and, consequently, easily maintainable. This is often not the case. Good architecture can often be an end in itself. Most of the code provided with the Kinect SDK embodies good architecture and should be studied with this in mind. The code provided with this book, on the other hand, is typically written to illustrate concepts in the most straightforward way possible. You should study both code samples as well as reference code to become an effective Kinect developer. In the following sections, we will introduce you to some of these samples and highlight parts of the code worth familiarizing yourself with.
Kinect Explorer
Kinect Explorer is a WPF project written in C#. It demonstrates the basic programming model for retrieving the color, depth, and skeleton streams and displaying them in a window—more or less the original criteria set for the AdaFruit Kinect hacking contest. Figure 1-8 shows the UI for the reference application. The video and depth streams are each used to populate and update a different image control in real time while the skeleton stream is used to create a skeletal overlay on these images. Besides the depth stream, video stream, and skeleton, the application also provides a running update of the frames per second processed by the depth stream. While the goal is 30 fps, this will tend to vary depending on the specifications of your computer.
Figure 1-8. Kinect Explorer reference application
The sample exposes some key concepts for working with the different data streams. The DepthFrameReady event handler, for instance, takes each image provided sequentially by the depth stream and parses it in order to distinguish player pixels from background pixels. Each image is broken down into a byte array. Each byte is then inspected to determine if it is associated with a player image or not. If it does belong to a player, the pixel is replaced with a flat color. If not, it is grayscaled. The bytes are then recast to a bitmap object and set as the source for an image control in the UI. Then the process begins again for the next image in the depth stream. One would expect that individually inspecting every byte in this stream would take a remarkably long time but, as the fps indicator shows, in fact it does not. This is actually the prevailing technique for manipulating both the color and depth streams. We will go into greater detail concerning the depth and color streams in Chapter 2 and Chapter 3 of this book.
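The per-pixel inspection described above can be sketched roughly as follows. This is a hedged illustration rather than the Kinect Explorer source: it assumes the SDK v1 convention that, when skeleton tracking is enabled, each 16-bit depth value packs a 3-bit player index in its low bits, with the measured depth in the remaining bits.

```csharp
// Sketch: flat-color the player pixels, grayscale the background by depth.
// Assumes SDK v1 depth format: bits 0-2 = player index, bits 3-15 = depth.
static byte[] ColorizePlayers(short[] depthData)
{
    const int PlayerIndexBitmask = 0x0007;
    const int PlayerIndexBitmaskWidth = 3;

    byte[] pixels = new byte[depthData.Length * 4]; // BGRA output buffer
    for (int i = 0; i < depthData.Length; i++)
    {
        int playerIndex = depthData[i] & PlayerIndexBitmask;
        int depth = depthData[i] >> PlayerIndexBitmaskWidth;

        if (playerIndex > 0)
        {
            pixels[i * 4 + 1] = 0xFF; // player pixel: flat green
        }
        else
        {
            // background pixel: nearer objects render brighter
            byte intensity = (byte)(255 - ((depth >> 4) & 0xFF));
            pixels[i * 4] = intensity;     // blue
            pixels[i * 4 + 1] = intensity; // green
            pixels[i * 4 + 2] = intensity; // red
        }
        pixels[i * 4 + 3] = 0xFF; // alpha
    }
    return pixels;
}
```

The resulting byte array can then be written into a bitmap (for example, via WriteableBitmap.WritePixels) and set as the source of the image control on each frame.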
Kinect Explorer is particularly interesting because it demonstrates how to break up the different capabilities of the Kinect sensor into reusable components. Instead of a central controlling process, each of the distinct viewer controls for video, color, skeleton, and audio independently controls its own access to its respective data stream. This distributed structure allows the various Kinect capabilities to be added independently and ad hoc to any application.
Beyond this interesting modular design, there are three specific pieces of functionality in Kinect Explorer that should be included in any Kinect application. The first is the way Kinect Explorer implements sensor discovery. As Listing 1-3 shows, the technique implemented in the reference application waits for Kinect sensors to be connected to a USB port on the computer. It defers any initialization of the streams until Kinect has been connected and is able to support multiple Kinects. This code effectively acts as a gatekeeper that prevents any problems that might occur when there is a disruption in the data streams caused by tripping over a wire or even simply forgetting to plug in the Kinect sensor.
Listing 1-3. Kinect Sensor Discovery
private void KinectStart()
{
    // listen to any status change for Kinects
    KinectSensor.KinectSensors.StatusChanged += Kinects_StatusChanged;

    // show status for each sensor that is found now
    foreach (KinectSensor kinect in KinectSensor.KinectSensors)
    {
        ShowStatus(kinect, kinect.Status);
    }
}
A second noteworthy feature of Kinect Explorer is the way it manages the Kinect sensor's motor controlling the sensor's angle of elevation. In early efforts to program with Kinect prior to the arrival of the SDK, it was uncommon to use software to raise and lower the angle of the Kinect head. In order to place the Kinect cameras correctly while programming, developers would manually lift and lower the angle of the Kinect head. This typically produced a loud and slightly frightening click but was considered a necessary evil as developers experimented with Kinect. Unfortunately, Kinect's internal motors were not built to handle this kind of stress. The rather sophisticated code provided with Kinect Explorer demonstrates how to perform this necessary task in a more genteel manner.
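A minimal sketch of software tilt control, assuming the SDK v1 KinectSensor members ElevationAngle, MinElevationAngle, and MaxElevationAngle. Setting ElevationAngle drives the motor directly, so it should be called sparingly; the motor is not designed for continuous adjustment.

```csharp
// Sketch: clamp the requested angle to the hardware range before applying it.
static void SetTilt(KinectSensor sensor, int degrees)
{
    int clamped = Math.Max(sensor.MinElevationAngle,
                  Math.Min(sensor.MaxElevationAngle, degrees));
    sensor.ElevationAngle = clamped; // e.g., +10 tilts the head upward
}
```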
The final piece of functionality deserving of careful study is the way skeletons from the skeleton stream are selected. The SDK only tracks full skeletons for two players at a time. By default, it uses a complicated set of rules to determine which players should be tracked in this way. However, the SDK also allows this default set of rules to be overridden by the Kinect developer. Kinect Explorer demonstrates how to override the basic rules and also provides several alternative algorithms for determining which players should receive full skeleton tracking, for instance by closest players and by most physically active players.
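The "closest player" strategy can be sketched as follows. This is a hedged example, not the Kinect Explorer code: it assumes the SDK v1 SkeletonStream.AppChoosesSkeletons flag and ChooseSkeletons method, and it uses LINQ (System.Linq) to pick the skeleton nearest the sensor.

```csharp
// Sketch: override the default selection rules and track the closest player.
static void TrackClosest(KinectSensor sensor, Skeleton[] skeletons)
{
    // disable the SDK's built-in selection rules
    sensor.SkeletonStream.AppChoosesSkeletons = true;

    Skeleton closest = skeletons
        .Where(s => s.TrackingState != SkeletonTrackingState.NotTracked)
        .OrderBy(s => s.Position.Z) // Z is distance from the sensor in meters
        .FirstOrDefault();

    if (closest != null)
        sensor.SkeletonStream.ChooseSkeletons(closest.TrackingId);
}
```

Calling this from each SkeletonFrameReady handler keeps full tracking pinned to whichever player is nearest at the moment.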
ShapeGame
The ShapeGame reference app, also a WPF application written in C#, is an ambitious project that ties together skeleton tracking, speech recognition, and basic physics simulation. It also supports up to two players at the same time. The ShapeGame introduces the concept of a game loop. Though not dealt with explicitly in this book, game loops are a central concept in game development that you will want to become familiar with in order to present shapes constantly falling from the top of the screen. In ShapeGame, the game loop is a C# while loop running in the GameThread method, as shown in Listing 1-4. The GameThread method tweaks the rate of the game loop to achieve the optimal frame rate. On every iteration of the while loop, the HandleGameTimer method is called to move shapes down the screen, add new shapes, and detect collisions between the skeleton hand joints and the falling shapes.
Listing 1-4. A Basic Game Loop
private void GameThread()
{
    runningGameThread = true;
    predNextFrame = DateTime.Now;
    actualFrameTime = 1000.0 / targetFramerate;
    while (runningGameThread)
    {
        . . .
        Dispatcher.Invoke(DispatcherPriority.Send,
            new Action<int>(HandleGameTimer), 0);
    }
}
The result is the game interface shown in Figure 1-9. While the ShapeGame sample uses primitive shapes for game components, such as lines and ellipses for the skeleton, it is also fairly easy to replace these shapes with images in order to create a more engaging experience.
Figure 1-9. ShapeGame
The ShapeGame also integrates speech recognition into the gameplay. The logic for the speech recognition is contained in the project's Recognizer class. It recognizes phrases of up to five words, with approximately 15 possible word choices for each word, potentially supporting a grammar of up to 700,000 phrases. The combination of gesture and speech recognition provides a way to experiment with mixed-modal gameplay with Kinect, something not widely used in Kinect games for the Xbox but around which there is considerable excitement. This book delves into the speech recognition capabilities of Kinect in Chapter 7.
Note: The skeleton tracking in the ShapeGame sample provided with the Kinect for Windows SDK highlights a common problem with straightforward rendering of joint coordinates. When a particular body joint falls outside of the camera's view, the joint behavior becomes erratic. This is most noticeable with the legs. A best practice is to create default positions and movements for in-game avatars. The default positions should only be overridden when the skeletal data for particular joints is valid.
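The best practice in the note can be sketched with a small guard function. This is a hedged example assuming the SDK v1 Joint type and its TrackingState property: only a joint that is actually tracked overrides the avatar's default position, while inferred or untracked joints leave the default in place.

```csharp
// Sketch: fall back to a default position when a joint is not reliably tracked.
static SkeletonPoint GetJointOrDefault(Skeleton skeleton, JointType type,
                                       SkeletonPoint defaultPosition)
{
    Joint joint = skeleton.Joints[type];
    return joint.TrackingState == JointTrackingState.Tracked
        ? joint.Position
        : defaultPosition;
}
```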
RecordAudio
The RecordAudio sample is the C# version of some of the features demonstrated in AudioCaptureRaw, MFAudioFilter, and MicArrayEchoCancellation. It is a C# console application that records and saves the raw audio from Kinect as a wav file. It also applies the source localization functionality shown in MicArrayEchoCancellation to indicate the source of the audio with respect to the Kinect sensor in radians. It introduces an important concept for working with wav data called the WAVEFORMATEX struct. This is a structure native to C++ that has been reimplemented as a C# struct in RecordAudio, as shown in Listing 1-5. It contains all the information, and only the information, required to define a wav audio file. There are also multiple C# implementations of it all over the web since it seems to be reinvented every time someone needs to work with wav files in managed code.
Listing 1-5. The WAVEFORMATEX Struct
struct WAVEFORMATEX
{
    public ushort wFormatTag;
    public ushort nChannels;
    public uint nSamplesPerSec;
    public uint nAvgBytesPerSec;
    public ushort nBlockAlign;
    public ushort wBitsPerSample;
    public ushort cbSize;
}
Speech Sample
The Speech sample application demonstrates how to use Kinect with the speech recognition engine provided in the Microsoft.Speech assembly. Speech is a console application written in C#. Whereas the MFAudioFilter sample used a WMA file as its sink, the Speech application uses the speech recognition engine as a sink in its audio processing pipeline.
The sample is fairly straightforward, demonstrating the concepts of Grammar objects and Choices objects, as shown in Listing 1-6, that have been a part of speech recognition programming since Windows XP. These objects are constructed to create custom lexicons of words and phrases that the application is configured to recognize. In the case of the Speech sample, this includes only three words: red, green, and blue.
Listing 1-6. Grammars and Choices
var colors = new Choices();
colors.Add("red");
colors.Add("green");
colors.Add("blue");
var gb = new GrammarBuilder();
gb.Culture = ri.Culture;
gb.Append(colors);
var g = new Grammar(gb);
The sample also introduces some widely used boilerplate code that uses C# LINQ syntax to instantiate the speech recognition engine, as illustrated in Listing 1-7. Instantiating the speech recognition engine requires using pattern matching to identify a particular string. The speech recognition engine effectively loops through all the recognizers installed on the computer until it finds one whose Id property matches the magic string. In this case, we use a LINQ expression to perform the loop. If the correct recognizer is found, it is then used to instantiate the speech recognition engine. If it is not found, the speech recognition engine cannot be used.
Listing 1-7. Finding the Kinect Recognizer
private static RecognizerInfo GetKinectRecognizer()
{
    Func<RecognizerInfo, bool> matchingFunc = r =>
    {
        string value;
        r.AdditionalInfo.TryGetValue("Kinect", out value);
        return "True".Equals(value, StringComparison.InvariantCultureIgnoreCase)
            && "en-US".Equals(r.Culture.Name,
                              StringComparison.InvariantCultureIgnoreCase);
    };
    return SpeechRecognitionEngine.InstalledRecognizers()
                                  .Where(matchingFunc)
                                  .FirstOrDefault();
}
Although simple, the Speech sample is a good starting point for exploring the Microsoft.Speech API. A productive way to use the sample is to begin adding additional word choices to the limited three-word target set. Then try to create the TellMe-style functionality on the Xbox by ignoring any phrase that does not begin with the word "Xbox." Then try to create a grammar that includes complex grammatical structures that include verbs, subjects, and objects as the ShapeGame SDK sample does.
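The "Xbox"-prefix idea can be sketched with the same GrammarBuilder/Choices pattern used in Listing 1-6. This is a hedged example: the command words ("play", "pause", "stop") are hypothetical, and `ri` is assumed to be the RecognizerInfo returned by the GetKinectRecognizer method in Listing 1-7.

```csharp
// Sketch: require the literal word "Xbox" before any command, so ordinary
// conversation is unlikely to trigger a recognition.
var commands = new Choices("play", "pause", "stop"); // hypothetical commands

var gb = new GrammarBuilder();
gb.Culture = ri.Culture; // ri is the RecognizerInfo from Listing 1-7
gb.Append("Xbox");       // every phrase must start with this word
gb.Append(commands);

var grammar = new Grammar(gb); // matches "Xbox play", "Xbox pause", ...
```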
This is, after all, the chief utility of the sample applications provided with the Kinect for Windows SDK. They provide code blocks that you can copy directly into your own code. They also offer a way to begin learning how to get things done with Kinect without necessarily understanding all of the concepts behind the Kinect API right away. I encourage you to play with this code as soon as possible. When you hit a wall, return to this book to learn more about why the Kinect API works the way it does and how to get further in implementing the specific scenarios you are interested in.
Summary
In this chapter, you learned about the surprisingly long history of gesture tracking as a distinct mode of natural user interface. You also learned about the central role Alex Kipman played in bringing Kinect technology to the Xbox and how Microsoft Research, Microsoft's research and development group, was used to bring Kinect to market. You found out the momentum online communities like OpenKinect added toward popularizing Kinect development beyond Xbox gaming, opening up a new trend in Kinect development on the PC. You learned how to install and start programming for the Microsoft Kinect for Windows SDK. Finally, you learned about the various pieces installed with the Kinect for Windows SDK and how to use them as a springboard for your own programming aspirations.
CHAPTER 2
Application Fundamentals
Every Kinect application has certain basic elements. The application must detect or discover attached Kinect sensors. It must then initialize the sensor. Once initialized, the sensor produces data, which the application then processes. Finally, when the application finishes using the sensor, it must properly uninitialize the sensor.
In the first section of this chapter, we cover sensor discovery, initialization, and uninitialization. These fundamental topics are critical to all forms of Kinect applications using Microsoft's Kinect for Windows SDK. The first section presents several code examples, which show code that is necessary for virtually any Kinect application you write and is required by every coding project in this book. The coding demonstrations in the subsequent chapters do not explicitly show the sensor discovery and initialization code. Instead, they simply mention this task as a project requirement.
Onceinitialized,Kinectgeneratesdatabasedoninputgatheredbythedifferentcameras.Thisdata
isavailabletoapplicationsthroughdatastreams.TheconceptissimilartotheIOstreamsfoundinthe
System.IOnamespace.Thesecondsectionofthischapterdetailsstreambasicsanddemonstrateshowto
pulldatafromKinectusingtheColorImageStream.Thisstreamcreatespixeldata,whichallowsan
applicationtocreateacolorimagelikeabasicphotoorvideocamera.Weshowhowtomanipulatethe
streamdatainfunandinterestingways,andweexplainhowtosavestreamdatatoenhanceyourKinect
application’suserexperience.
Thefinalsectionofthischaptercomparesandcontraststhetwoapplicationarchitecturemodels
(EventandPolling)availablewithMicrosoft’sKinectforWindowsSDK.Wedetailhowtouseeach
architecturestructureandwhy.Thisincludescodeexamples,whichcanserveastemplatesforyournext
project.
Thischapterisanecessaryreadasitisthefoundationfortheremainingchaptersofthebook,and
becauseitcoversthebasicsoftheentireSDK.Afterreadingthischapter,findingyourwaythroughthe
restoftheSDKiseasy.TheadventureinKinectapplicationdevelopmentbeginsnow.Havefun!
The Kinect Sensor
Kinect application development starts with the KinectSensor. This object directly represents the Kinect hardware. It is from the KinectSensor object that you access the data streams for video (color) and depth images as well as skeleton tracking. In this chapter, we explore only the ColorImageStream. The DepthImageStream and SkeletonStream warrant entire chapters to themselves.
The most common method of data retrieval from the sensor's streams is from a set of events on the KinectSensor object. Each stream has an associated event, which fires when the stream has a frame of data available for processing. Each stream packages data in what is termed a frame. For example, the ColorFrameReady event fires when the ColorImageStream has new data. We examine each of these events in more depth when covering the particular sensor stream. More general eventing details are provided later in this chapter, when discussing the two different data retrieval architecture models.
Each of the data streams (color, depth, and skeleton) returns data points in a different coordinate system; this becomes clearer as we explore each data stream in detail. It is a common task to translate data points generated in one stream to data points in another. Later in this chapter we demonstrate how and why point translations are needed. The KinectSensor object has a set of methods to perform the data stream data point translations. They are MapDepthToColorImagePoint, MapDepthToSkeletonPoint, MapSkeletonPointToColor, and MapSkeletonPointToDepth. Before we are able to work with any Kinect data, we must find an attached Kinect. The process of discovering connected sensors is easy, but requires some explanation.
Discovering a Connected Sensor
The KinectSensor object does not have a public constructor and cannot be created by an application. Instead, the SDK creates KinectSensor objects when it detects an attached Kinect. The application must discover or be notified when a Kinect is attached to the computer. The KinectSensor class has a static property named KinectSensors. This property is of type KinectSensorCollection. The KinectSensorCollection object inherits from ReadOnlyCollection and is simple, as it consists only of an indexer and an event named StatusChanged.
The indexer provides access to KinectSensor objects. The collection count is always equal to the number of attached Kinects. Yes, this means it is possible to build applications that use more than one Kinect! Your application can use as many Kinects as desired. You are only limited by the muscle of the computer running your application, because the SDK does not restrict the number of devices. Because of the power and bandwidth needs of Kinect, each device requires a separate USB controller. Additionally, when using multiple Kinects on a single computer, the CPU and memory demands necessitate some serious hardware. Given this, we consider multi-Kinect applications an advanced topic that is beyond the scope of this book. Throughout this book, we only ever consider using a single Kinect device. All code examples in this book are written to use a single device and ignore all other attached Kinects.
Finding an attached Kinect is as easy as iterating through the collection; however, just the presence of a KinectSensor in the collection does not mean it is directly usable. The KinectSensor object has a property named Status, which indicates the device's state. The property's type is KinectStatus, which is an enumeration. Table 2-1 lists the different status values and explains their meaning.
Table 2-1. KinectStatus Values and Significance

KinectStatus: What it means
Undefined: The status of the attached device cannot be determined.
Connected: The device is attached and is capable of producing data from its streams.
DeviceNotGenuine: The attached device is not an authentic Kinect sensor.
Disconnected: The USB connection with the device has been broken.
Error: Communication with the device produces errors.
Initializing: The device is attached to the computer, and is going through the process of connecting.
InsufficientBandwidth: Kinect cannot initialize, because the USB connector does not have the necessary bandwidth required to operate the device.
NotPowered: Kinect is not fully powered. The power provided by a USB connection is not sufficient to power the Kinect hardware. An additional power adapter is required.
NotReady: Kinect is attached, but has yet to enter the Connected state.
A KinectSensor cannot be initialized until it reaches a Connected status. During an application's lifespan, a sensor can change state, which means the application must monitor the state changes of attached devices and react appropriately to the status change and to the needs of the user experience. For example, if the USB cable is removed from the computer, the sensor's status changes to Disconnected. In general, an application should pause and notify the user to plug Kinect back into the computer. An application must not assume Kinect will be connected and ready for use at startup, or that the sensor will maintain connectivity throughout the life of the application.
Create a new WPF project using Visual Studio so that we can properly demonstrate the discovery process. Add a reference to Microsoft.Kinect.dll, and update the MainWindow.xaml.cs code, as shown in Listing 2-1. The code listing shows the basic code to detect and monitor a Kinect sensor.
Listing 2-1. Detecting a Kinect Sensor
public partial class MainWindow : Window
{
    #region Member Variables
    private KinectSensor _Kinect;
    #endregion Member Variables

    #region Constructor
    public MainWindow()
    {
        InitializeComponent();
        this.Loaded += (s, e) => { DiscoverKinectSensor(); };
        this.Unloaded += (s, e) => { this.Kinect = null; };
    }
    #endregion Constructor

    #region Methods
    private void DiscoverKinectSensor()
    {
        KinectSensor.KinectSensors.StatusChanged += KinectSensors_StatusChanged;
        this.Kinect = KinectSensor.KinectSensors
                                  .FirstOrDefault(x => x.Status == KinectStatus.Connected);
    }

    private void KinectSensors_StatusChanged(object sender, StatusChangedEventArgs e)
    {
        switch(e.Status)
        {
            case KinectStatus.Connected:
                if(this.Kinect == null)
                {
                    this.Kinect = e.Sensor;
                }
                break;

            case KinectStatus.Disconnected:
                if(this.Kinect == e.Sensor)
                {
                    this.Kinect = null;
                    this.Kinect = KinectSensor.KinectSensors
                                              .FirstOrDefault(x => x.Status == KinectStatus.Connected);

                    if(this.Kinect == null)
                    {
                        //Notify the user that the sensor is disconnected
                    }
                }
                break;
        }
        //Handle all other statuses according to needs
    }
    #endregion Methods

    #region Properties
    public KinectSensor Kinect
    {
        get { return this._Kinect; }
        set
        {
            if(this._Kinect != value)
            {
                if(this._Kinect != null)
                {
                    //Uninitialize
                    this._Kinect = null;
                }

                if(value != null && value.Status == KinectStatus.Connected)
                {
                    this._Kinect = value;
                    //Initialize
                }
            }
        }
    }
    #endregion Properties
}
An examination of the code in Listing 2-1 begins with the member variable _Kinect and the property named Kinect. An application should always maintain a local reference to the KinectSensor objects used by the application. There are several reasons for this, which become more obvious as you proceed through the book; however, at the very least, a reference is needed to uninitialize the KinectSensor when the application is finished using the sensor. The property serves as a wrapper for the member variable. The primary purpose of using a property is to ensure all sensor initialization and uninitialization is in a common place and executed in a structured way. Notice in the property's setter how the member variable is not set unless the incoming value has a status of KinectStatus.Connected. When going through the sensor discovery process, an application should only be concerned with connected devices. Besides, any attempt to initialize a sensor that does not have a connected status results in an InvalidOperationException.
In the constructor are two anonymous methods, one to respond to the Loaded event and the other for the Unloaded event. When unloaded, the application sets the Kinect property to null, which uninitializes the sensor used by the application. In response to the window's Loaded event, the application attempts to discover a connected sensor by calling the DiscoverKinectSensor method. The primary motivation for using the Loaded and Unloaded events of the Window is that they serve as solid points to begin and end Kinect processing. If the application fails to discover a valid Kinect, the application can visually notify the user.
The DiscoverKinectSensor method only has two lines of code, but they are important. The first line subscribes to the StatusChanged event of the KinectSensors object. The second line of code uses a lambda expression to find the first KinectSensor object in the collection with a status of KinectStatus.Connected. The result is assigned to the Kinect property. The property setter code initializes any non-null sensor object.
The StatusChanged event handler (KinectSensors_StatusChanged) is straightforward and self-explanatory. However, it is worth mentioning the code for when the status is equal to KinectStatus.Connected. The function of the if statement is to limit the application to one sensor. The application ignores any subsequent Kinects connected once one sensor is discovered and initialized by the application.
The code in Listing 2-1 illustrates the minimal code required to discover and maintain a reference to a Kinect device. The needs of each individual application are likely to necessitate additional code or processing, but the core remains. As your applications become more advanced, controls or other classes will contain code similar to this. It is important to ensure thread safety and release resources properly for garbage collection to prevent memory leaks.
Starting the Sensor
Once discovered, Kinect must be initialized before it can begin producing data for your application. The initialization process consists of three steps. First, your application must enable the streams it needs. Each stream has an Enable method, which initializes the stream. Each stream is uniquely different and as such has settings that require configuring before it is enabled. In some cases, these settings are properties, and in others they are parameters on the Enable method. Later in this chapter, we cover initializing the ColorImageStream. Chapter 3 details the initialization process for the DepthImageStream, and Chapter 4 gives the particulars on the SkeletonStream.
The next step is determining how your application retrieves the data from the streams. The most common means is through a set of events on the KinectSensor object. There is an event for each stream (ColorFrameReady for the ColorImageStream, DepthFrameReady for the DepthImageStream, and SkeletonFrameReady for the SkeletonStream), and the AllFramesReady event, which synchronizes the frame data of all the streams so that all frames are available at once. Individual frame-ready events fire only when the particular stream is enabled, whereas the AllFramesReady event fires when one or more streams is enabled.
Finally, the application must start the KinectSensor object by calling the Start method. Almost immediately after calling the Start method, the frame-ready events begin to fire. Ensure that your application is prepared to handle incoming Kinect data before starting the KinectSensor.
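The three steps just described (enable the streams, choose a retrieval mechanism, call Start) can be sketched end to end. The sketch below uses the AllFramesReady event to receive synchronized color and depth frames; the method names are our own illustration, not part of the SDK, and the frame processing itself is omitted.

```csharp
// Sketch: initialize a connected KinectSensor for synchronized retrieval.
private void InitializeAllFrames(KinectSensor sensor)
{
    sensor.ColorStream.Enable();
    sensor.DepthStream.Enable();
    sensor.AllFramesReady += Kinect_AllFramesReady;
    sensor.Start();
}

private void Kinect_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    using(ColorImageFrame colorFrame = e.OpenColorImageFrame())
    using(DepthImageFrame depthFrame = e.OpenDepthImageFrame())
    {
        // Either frame can be null; always check before processing.
        if(colorFrame != null && depthFrame != null)
        {
            // Process the matched pair of frames here.
        }
    }
}
```

Because AllFramesReady fires when one or more streams is enabled, the null checks above matter: a frame for a stream that produced no data in this cycle comes back as null.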
Stopping the Sensor
Once started, the KinectSensor is stopped by calling the Stop method. All data production stops; however, you can expect the frame-ready events to fire one last time, so remember to add checks for null frame objects in your frame-ready event handlers. The process to stop the sensor is straightforward enough, but the motivations for doing so add potential complexity, which can affect the architecture of your application.
It is too simplistic to think that the only reason for having the Stop method is that every on switch must also have an off position. The KinectSensor object and its streams use system resources, and all well-behaved applications should properly release these resources when no longer needed. In this case, the application would not only stop the sensor, but also would unsubscribe from the frame-ready event handlers. Be careful not to call the Dispose method on the KinectSensor or the streams. This prevents your application from accessing the sensor again. The application must be restarted, or the sensor must be unplugged and plugged in again, before the disposed sensor is again available for use.
The ColorImageStream
Kinect has two cameras: an IR camera and a normal video camera. The video camera produces a basic color video feed like any off-the-shelf video camera or webcam. This stream is the least complex of the three in the data it produces and its configuration settings. Therefore, it serves perfectly as an introduction to using a Kinect data stream.
Working with a Kinect data stream is a three-step process. The stream must first be enabled. Once enabled, the application extracts frame data from the stream, and finally the application processes the frame data. The last two steps continue over and over for as long as frame data is available. Continuing with the code from Listing 2-1, we add code to initialize the ColorImageStream, as shown in Listing 2-2.
Listing 2-2. Enabling the ColorImageStream
public KinectSensor Kinect
{
    get { return this._Kinect; }
    set
    {
        if(this._Kinect != value)
        {
            if(this._Kinect != null)
            {
                UninitializeKinectSensor(this._Kinect);
                this._Kinect = null;
            }

            if(value != null && value.Status == KinectStatus.Connected)
            {
                this._Kinect = value;
                InitializeKinectSensor(this._Kinect);
            }
        }
    }
}

private void InitializeKinectSensor(KinectSensor sensor)
{
    if(sensor != null)
    {
        sensor.ColorStream.Enable();
        sensor.ColorFrameReady += Kinect_ColorFrameReady;
        sensor.Start();
    }
}

private void UninitializeKinectSensor(KinectSensor sensor)
{
    if(sensor != null)
    {
        sensor.Stop();
        sensor.ColorFrameReady -= Kinect_ColorFrameReady;
    }
}
The first part of Listing 2-2 shows the Kinect property with updates in bold. The two new lines call two new methods, which initialize and uninitialize the KinectSensor and the ColorImageStream. The InitializeKinectSensor method enables the ColorImageStream, subscribes to the ColorFrameReady event, and starts the sensor. Once started, the sensor continually calls the frame-ready event handler when a new frame of data is available, which in this instance is 30 times per second.
At this point, our project is incomplete and fails to compile. We need to add the code for the Kinect_ColorFrameReady event handler. Before doing this, we need to add some code to the XAML. Each time the frame-ready event handler is called, we want to create a bitmap image from the frame's data, and we need someplace to display the image. Listing 2-3 shows the XAML needed to service our needs.
Listing 2-3. Displaying a Color Frame Image
<Window x:Class="BeginningKinect.Chapter2.ApplicationFundamentals.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="480" Width="640">
    <Grid>
        <Image x:Name="ColorImageElement"/>
    </Grid>
</Window>
Listing 2-4 contains the frame-ready event handler. The processing of frame data begins by getting or opening the frame. The OpenColorImageFrame method on the ColorImageFrameReadyEventArgs object returns the current ColorImageFrame object. The frame object is disposable, which is why the code wraps the call to OpenColorImageFrame in a using statement. Extracting pixel data from the frame first requires us to create a byte array to hold the data. The PixelDataLength property on the frame object gives the exact size of the data and subsequently the size of the array. Calling the CopyPixelDataTo method populates the array with pixel data. The last line of code creates a bitmap image from the pixel data and displays the image on the UI.
Listing 2-4. Processing ColorImageFrame Data
private void Kinect_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using(ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if(frame != null)
        {
            byte[] pixelData = new byte[frame.PixelDataLength];
            frame.CopyPixelDataTo(pixelData);

            ColorImageElement.Source = BitmapImage.Create(frame.Width, frame.Height, 96, 96,
                                                          PixelFormats.Bgr32, null, pixelData,
                                                          frame.Width * frame.BytesPerPixel);
        }
    }
}
With the code in place, compile and run. The result should be a live video feed from Kinect. You would see this same output from a webcam or any other video camera. This alone is nothing special. The difference is that it is coming from Kinect, and as we know, Kinect can see things that a webcam or generic video camera cannot.
Better Image Performance
The code in Listing 2-4 creates a new bitmap image for each color image frame. An application using this code and the default image format creates 30 bitmap images per second. Thirty times per second, memory for a new bitmap object is allocated, initialized, and populated with pixel data. The memory of the previous frame's bitmap is also marked for garbage collection thirty times per second, which means the garbage collector is likely working harder than in most applications. In short, there is a lot of work being done for each frame. In simple applications, there is no discernible performance loss; however, for more complex and performance-demanding applications, this is unacceptable. Fortunately, there is a better way.
The solution is to use the WriteableBitmap object. This object is part of the System.Windows.Media.Imaging namespace, and was built to handle frequent updates of image pixel data. When creating the WriteableBitmap, the application must define the image properties such as the width, height, and pixel format. This allows the WriteableBitmap object to allocate the memory once and just update pixel data as needed.
The code changes necessary to use the WriteableBitmap are only minor. Listing 2-5 begins by declaring three new member variables. The first is the actual WriteableBitmap object, and the other two are used when updating pixel data. The values of the image rectangle and image stride do not change from frame to frame, so we can calculate them once when creating the WriteableBitmap.
Listing 2-5 also shows changes, in bold, to the InitializeKinect method. These new lines of code create the WriteableBitmap object and prepare it to receive pixel data. The image rectangle and stride calculations are included. With the WriteableBitmap created and initialized, it is set to be the image source for the UI Image element (ColorImageElement). At this point, the WriteableBitmap contains no pixel data, so the UI image is blank.
Listing 2-5. Create a Frame Image More Efficiently
private WriteableBitmap _ColorImageBitmap;
private Int32Rect _ColorImageBitmapRect;
private int _ColorImageStride;

private void InitializeKinect(KinectSensor sensor)
{
    if(sensor != null)
    {
        ColorImageStream colorStream = sensor.ColorStream;
        colorStream.Enable();

        this._ColorImageBitmap     = new WriteableBitmap(colorStream.FrameWidth,
                                                         colorStream.FrameHeight, 96, 96,
                                                         PixelFormats.Bgr32, null);
        this._ColorImageBitmapRect = new Int32Rect(0, 0, colorStream.FrameWidth,
                                                   colorStream.FrameHeight);
        this._ColorImageStride     = colorStream.FrameWidth * colorStream.FrameBytesPerPixel;
        ColorImageElement.Source   = this._ColorImageBitmap;

        sensor.ColorFrameReady += Kinect_ColorFrameReady;
        sensor.Start();
    }
}
To complete this upgrade, we need to replace one line of code from the ColorFrameReady event handler. Listing 2-6 shows the event handler with the new line of code in bold. First, delete the code that created a new bitmap from the frame data. The code updates the image pixels by calling the WritePixels method on the WriteableBitmap object. The method takes in the desired image rectangle, an array of bytes representing the pixel data, the image stride, and an offset. The offset is always zero, because we are replacing every pixel in the image.
Listing 2-6. Updating the Image Pixels
private void Kinect_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using(ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if(frame != null)
        {
            byte[] pixelData = new byte[frame.PixelDataLength];
            frame.CopyPixelDataTo(pixelData);

            this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect, pixelData,
                                               this._ColorImageStride, 0);
        }
    }
}
Any Kinect application that displays image frame data from either the ColorImageStream or the DepthImageStream should use the WriteableBitmap to display frame images. In the best case, the color stream produces 30 frames per second, which places a large demand on memory resources. The WriteableBitmap reduces the memory consumption, and subsequently the number of memory allocation and deallocation operations, needed to support the demands of constant image updates. After all, the display of frame data is likely not the primary function of your application. Therefore, you want image generation to be as performant as possible.
Simple Image Manipulation
Each ColorImageFrame returns raw pixel data in the form of an array of bytes. An application must explicitly create an image from this data. This means that, if so inspired, we can alter the pixel data before creating the image for display. Let's do a quick experiment and have some fun. Add the code in bold in Listing 2-7 to the Kinect_ColorFrameReady event handler.
Listing 2-7. Seeing Shades of Red
private void Kinect_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using(ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if(frame != null)
        {
            byte[] pixelData = new byte[frame.PixelDataLength];
            frame.CopyPixelDataTo(pixelData);

            for(int i = 0; i < pixelData.Length; i += frame.BytesPerPixel)
            {
                pixelData[i]     = 0x00;    //Blue
                pixelData[i + 1] = 0x00;    //Green
            }

            this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect, pixelData,
                                               this._ColorImageStride, 0);
        }
    }
}
This experiment turns off the blue and green channels of each pixel. The for loop in Listing 2-7 iterates through the bytes such that i is always the position of the first byte of each pixel. Since pixel data is in Bgr32 format, the first byte is the blue channel, followed by green and red. The two lines of code inside the loop set the blue and green byte values of each pixel to zero. The output is an image with only shades of red. This is a very basic example of image processing.
Our loop manipulates the color of each pixel. That manipulation is actually similar to the function of a pixel shader: algorithms, often very complex, that manipulate the colors of each pixel. Chapter 8 takes a deeper look at using pixel shaders with Kinect. In the meantime, try the simple pseudo-pixel shaders in the following list. All you have to do is replace the code inside the for loop. I encourage you to experiment on your own, and research pixel effects and shaders. Be mindful that this type of processing can be very resource intensive and the performance of your application could suffer. Pixel shading is generally a low-level process performed by the GPU on the computer's graphics card, and not often by high-level languages such as C#.
• Inverted Colors – Before digital cameras, there was film. This is how a picture looked on the film before it was processed onto paper.

    pixelData[i]     = (byte) ~pixelData[i];
    pixelData[i + 1] = (byte) ~pixelData[i + 1];
    pixelData[i + 2] = (byte) ~pixelData[i + 2];

• Apocalyptic Zombie – Invert the red value and swap the blue and green values.

    byte blue        = pixelData[i];
    pixelData[i]     = pixelData[i + 1];
    pixelData[i + 1] = blue;
    pixelData[i + 2] = (byte) ~pixelData[i + 2];

• Grayscale

    byte gray        = Math.Max(pixelData[i], pixelData[i + 1]);
    gray             = Math.Max(gray, pixelData[i + 2]);
    pixelData[i]     = gray;
    pixelData[i + 1] = gray;
    pixelData[i + 2] = gray;

• Grainy black and white movie

    byte gray        = Math.Min(pixelData[i], pixelData[i + 1]);
    gray             = Math.Min(gray, pixelData[i + 2]);
    pixelData[i]     = gray;
    pixelData[i + 1] = gray;
    pixelData[i + 2] = gray;

• Washed out colors

    double gray = (pixelData[i] * 0.11) +
                  (pixelData[i + 1] * 0.59) +
                  (pixelData[i + 2] * 0.3);
    double desaturation = 0.75;

    pixelData[i]     = (byte) (pixelData[i] + desaturation * (gray - pixelData[i]));
    pixelData[i + 1] = (byte) (pixelData[i + 1] + desaturation * (gray - pixelData[i + 1]));
    pixelData[i + 2] = (byte) (pixelData[i + 2] + desaturation * (gray - pixelData[i + 2]));

• High saturation – Also try reversing the logic so that when the if condition is true, the color is turned on (0xFF), and when false, it is turned off (0x00).

    if(pixelData[i] < 0x33 || pixelData[i] > 0xE5)
    {
        pixelData[i] = 0x00;
    }
    else
    {
        pixelData[i] = 0xFF;
    }

    if(pixelData[i + 1] < 0x33 || pixelData[i + 1] > 0xE5)
    {
        pixelData[i + 1] = 0x00;
    }
    else
    {
        pixelData[i + 1] = 0xFF;
    }

    if(pixelData[i + 2] < 0x33 || pixelData[i + 2] > 0xE5)
    {
        pixelData[i + 2] = 0x00;
    }
    else
    {
        pixelData[i + 2] = 0xFF;
    }
Taking a Snapshot
A fun thing to do in any Kinect application is to capture pictures from the video camera. Because of the gestural nature of Kinect applications, people are often in awkward and funny positions. Taking snapshots especially works if your application is some form of augmented reality. Several Xbox Kinect games take snapshots of players at different points in the game. This provides an extra source of fun, because the pictures are viewable after the game has ended. Players are given an additional form of entertainment by laughing at themselves. What is more, these images are sharable. Your application can upload the images to social sites such as Facebook, Twitter, or Flickr. Saving frames during the game or providing a "Take a picture" button adds to the experience of all Kinect applications. The best part is that it is really simple to code. The first step is to add a button to our XAML, as shown in Listing 2-8.
Listing 2-8. Add a Button to Take a Snapshot
<Window x:Class="BeginningKinect.Chapter2.ApplicationFundamentals.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="350" Width="525">
    <Grid>
        <Image x:Name="VideoStreamElement"/>

        <StackPanel HorizontalAlignment="Left" VerticalAlignment="Top">
            <Button Content="Take Picture" Click="TakePictureButton_Click"/>
        </StackPanel>
    </Grid>
</Window>
In the code-behind for the MainWindow, add a using statement to reference System.IO. Among other things, this namespace contains objects that read and write files to the hard drive. Next, create the TakePictureButton_Click event handler, as shown in Listing 2-9.
Listing 2-9. Taking a Picture
private void TakePictureButton_Click(object sender, RoutedEventArgs e)
{
    string fileName = "snapshot.jpg";

    if(File.Exists(fileName))
    {
        File.Delete(fileName);
    }

    using(FileStream savedSnapshot = new FileStream(fileName, FileMode.CreateNew))
    {
        BitmapSource image = (BitmapSource) VideoStreamElement.Source;

        JpegBitmapEncoder jpgEncoder = new JpegBitmapEncoder();
        jpgEncoder.QualityLevel      = 70;
        jpgEncoder.Frames.Add(BitmapFrame.Create(image));
        jpgEncoder.Save(savedSnapshot);

        savedSnapshot.Flush();
        savedSnapshot.Close();
        savedSnapshot.Dispose();
    }
}
The first few lines of code in Listing 2-9 remove any existing snapshots. This is just to make the save process easy. There are more robust ways of handling saved files or maintaining files, but we leave document management details for you to address. This is just a simple approach to saving a snapshot.
Once the old saved snapshot is deleted, we open a FileStream to create a new file. The JpegBitmapEncoder object translates the image from the UI to a standard JPEG file. After saving the JPEG to the open FileStream object, we flush, close, and dispose of the file stream. These last three actions are unnecessary, because we have the FileStream wrapped in the using statement, but we generally like to explicitly write this code to ensure that the file handle and other resources are released.
Test it out! Run the application, strike a pose, and click the "Take Picture" button. Your snapshot should be sitting in the same directory in which the application is running. The location where the image is saved depends on the value of the fileName variable, so you ultimately control where the file is saved. This demonstration is meant to be simple, but a more complex application would likely require more thought as to where to save the images. Have fun with the snapshot feature. Go back through the image manipulation section, and take snapshots of you and your friends with the different image treatments.
Reflecting on the Objects
To this point, we have accomplished discovering and initializing a Kinect sensor. We created color images from the Kinect's video camera. Let us take a moment to visualize the essence of these classes, and how they relate to each other. This also gives us a second to reflect on what we've learned so far, as we quickly brushed over a couple of important objects. Figure 2-1 is called a class diagram. The class diagram is a tool software developers use for object modeling. The purpose of this diagram is to illustrate the interconnected nature of classes, enumerations, and structures. It also shows all class members (fields, methods, and properties).
Figure 2-1. The ColorImageStream object model
We know from working with the ColorImageStream (Listing 2-2) that it is a property of a KinectSensor object. The color stream, like all streams on the KinectSensor, must be enabled for it to produce frames of data. The ColorImageStream has an overloaded Enable method. The default method takes no parameters, while the other provides a means for defining the image format of each frame through a single parameter. The parameter's type is ColorImageFormat, which is an enumeration. Table 2-2 lists the values of the ColorImageFormat enumeration. The default Enable method of the ColorImageStream sets the image format to RGB with a resolution of 640x480 at 30 frames per second. Once enabled, the image format is available by way of the Format property.
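For example, requesting the higher-resolution format instead of the default is a one-line change. A brief sketch, assuming a connected KinectSensor named sensor:

```csharp
// Enable the color stream with an explicit format rather than the default
// RgbResolution640x480Fps30.
sensor.ColorStream.Enable(ColorImageFormat.RgbResolution1280x960Fps12);

// Once enabled, the chosen format is exposed by the Format property.
ColorImageFormat format = sensor.ColorStream.Format;
```

Note that the frame dimensions and pixel data length change with the format, so any buffers or WriteableBitmap objects sized from the stream must be created after Enable is called.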
Table 2-2. ColorImageStream Formats

ColorImageFormat: What it means
Undefined: The image resolution is indeterminate.
RgbResolution640x480Fps30: The image resolution is 640x480; pixel data is RGB32 format at 30 frames per second.
RgbResolution1280x960Fps12: The image resolution is 1280x960; pixel data is RGB32 format at 12 frames per second.
YuvResolution640x480Fps15: The image resolution is 640x480; pixel data is YUV format at 15 frames per second.
RawYuvResolution640x480Fps15: The image resolution is 640x480; pixel data is raw YUV format at 15 frames per second.
The ColorImageStream has five properties that provide measurements of the camera's field of view. The properties all begin with Nominal, because the values scale to the resolution set when the stream is enabled. Some applications need to perform calculations based on the camera's optic properties, such as field of view and focal length. The properties on the ColorImageStream allow developers to code to the properties, making the applications more robust to resolution changes or future hardware updates, which provide greater quality image resolutions. In Chapter 3, we demonstrate uses for these properties, but using the DepthImageStream.
The ImageStream class is the base for the ColorImageStream (and the DepthImageStream, too). As such, the ColorImageStream inherits the four properties that describe the pixel data generated by each frame produced by the stream. We used these properties in Listing 2-5 to create a WriteableBitmap object. The values of these properties depend on the ColorImageFormat specified when the stream is enabled.
To complete the coverage of the ImageStream class, the class defines a property and a method not discussed up to now, named IsEnabled and Disable, respectively. The IsEnabled property is read-only. It returns true after the stream is enabled and false after a call to the Disable method. The Disable method deactivates the stream. Frame production ceases and, as a result, the ColorFrameReady event on the KinectSensor object stops firing.
When enabled, the ColorImageStream produces ColorImageFrame objects. The ColorImageFrame object is simple. It has a property named Format, which is the ColorImageFormat value of the parent stream. It has a single, non-inherited method, CopyPixelDataTo, which copies image pixel data to a specified byte array. The read-only PixelDataLength property defines the size of the array. The value of the PixelDataLength property is calculated by multiplying the values of the Width, Height, and BytesPerPixel properties. These properties are all inherited from the abstract class ImageFrame.
BYTES PER PIXEL

The stream Format determines the pixel format and therefore the meaning of the bytes. If the stream is enabled using format ColorImageFormat.RgbResolution640x480Fps30, the pixel format is Bgr32. This means that there are 32 bits (4 bytes) per pixel. The first byte is the blue channel value, the second is green, and the third is red. The fourth byte is unused. The Bgra32 pixel format is also valid to use when working with an RGB resolution (RgbResolution640x480Fps30 or RgbResolution1280x960Fps12). In the Bgra32 pixel format, the fourth byte determines the alpha or opacity of the pixel.

If the image size is 640x480, then the byte array will have 1,228,800 bytes (height * width * BytesPerPixel = 640 * 480 * 4).

As a side note, when working with images, you will see the term stride. The stride is the number of bytes for a row of pixels. Multiplying the image width by the bytes per pixel calculates the stride.
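The arithmetic in this sidebar can be written out directly. This sketch simply restates the calculation for the default Bgr32 format:

```csharp
// Buffer size and stride for a 640x480 Bgr32 color frame.
int width         = 640;
int height        = 480;
int bytesPerPixel = 4;    // Bgr32: blue, green, red, one unused byte

int stride       = width * bytesPerPixel;    // bytes per row of pixels: 2560
int bufferLength = height * stride;          // total bytes per frame: 1228800
```

In practice you rarely compute these by hand; the stream's FrameWidth, FrameHeight, and FrameBytesPerPixel properties, and the frame's PixelDataLength property, give the same numbers for whatever format is enabled.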
In addition to having properties that describe the pixel data, the ColorImageFrame object has a couple of properties to describe the frame itself. Each frame produced by the stream is numbered. The frame number increases sequentially as time progresses. An application should not expect the frame number to always be incrementally one greater than the previous frame, but rather only to be greater than the previous frame, because it is almost impossible for an application not to skip over frames during normal execution. The other property, which describes the frame, is Timestamp. The Timestamp is the number of milliseconds since the KinectSensor was started (that is, since the Start method was called). The Timestamp value resets to zero each time the KinectSensor starts.
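Since frame numbers are only guaranteed to increase, a little bookkeeping reveals dropped frames. This is a hedged sketch with plain integers standing in for FrameNumber values; the SkippedFrames helper is our own name, not an SDK member:

```csharp
using System;

// Counts frames dropped between two observed frame numbers.
// Frame numbers always increase, but not necessarily by exactly one.
int SkippedFrames(int previousFrameNumber, int currentFrameNumber) =>
    Math.Max(0, currentFrameNumber - previousFrameNumber - 1);

Console.WriteLine(SkippedFrames(10, 11)); // 0 - consecutive frames
Console.WriteLine(SkippedFrames(10, 14)); // 3 - frames 11, 12, and 13 were missed
```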
Data Retrieval: Events and Polling
The projects throughout this chapter rely on events from the KinectSensor object to deliver frame data for processing. Events are prolific in WPF and well understood by developers as the primary means of notifying application code when data or state changes occur. For most Kinect-based applications, the event model is sufficient; however, it is not the only means of retrieving frame data from a stream. An application can manually poll a Kinect data stream for a new frame of data.

Polling is a simple process by which an application manually requests a frame of data from a stream. Each Kinect data stream has a method named OpenNextFrame. When calling the OpenNextFrame method, the application specifies a timeout value, which is the amount of time the application is willing to wait for a new frame. The timeout is measured in milliseconds. The method attempts to retrieve a new frame of data from the sensor before the timeout expires. If the timeout expires, the method returns a null frame.
When using the event model, the application subscribes to the stream's frame-ready event. Each time the event fires, the event handler calls a method on the event arguments to get the next frame. For example, when working with the color stream, the event handler calls OpenColorImageFrame on the ColorImageFrameReadyEventArgs to get the ColorImageFrame object. The event handler should always test the frame for a null value, because there are circumstances when the event fires but there is no available frame. Beyond that, the event model requires no extra sanity checks or exception handling.

By contrast, the OpenNextFrame method has three conditions under which it raises an InvalidOperationException. An application can expect an exception when the KinectSensor is not running, when the stream is not enabled, or when using events to retrieve frame data. An application
must choose either events or polling, but cannot use both. However, an application can use events for one stream, but polling on another. For example, an application could subscribe to the ColorFrameReady event, but poll the SkeletonStream for frame data. The one caveat is that the AllFramesReady event covers all streams, meaning that if an application subscribes to the AllFramesReady event, any attempt to poll any of the streams results in an InvalidOperationException.

Before demonstrating how to code the polling model, it is good to understand the circumstances under which an application would require polling. The most basic reason for using polling is performance. Polling removes the innate overhead associated with events. When an application takes on frame retrieval itself, it is rewarded with performance gains. The drawback is that polling is more complicated to implement than using events.

Besides performance, the application type can necessitate using polling. When using the Kinect for Windows SDK, an application is not required to use WPF. The SDK also works with XNA, which has a different application loop than WPF and is not event driven. If the needs of the application dictate using XNA, for example when building games, you have to use polling. Using the Kinect for Windows SDK, it is also possible to create console applications, which have no user interface at all. Imagine creating a robot that uses Kinect for eyes. The application that drives the functions of the robot does not need a user interface. It continually polls and processes the frame data to provide input to the robot. In this use case, polling is the best option.
As cool as it would be for this book to provide you with a project that builds a robot to demonstrate polling, it is impossible. Instead, we modestly recreate the previous project, which displays color image frames. To begin, create a new project, add a reference to Microsoft.Kinect.dll, and update MainWindow.xaml to include an Image element named ColorImageElement. Listing 2-10 shows the base code for MainWindow.
Listing 2-10. Preparing the Foundation for a Polling Application

#region Member Variables
private KinectSensor _Kinect;
private WriteableBitmap _ColorImageBitmap;
private Int32Rect _ColorImageBitmapRect;
private int _ColorImageStride;
private byte[] _ColorImagePixelData;
#endregion Member Variables

#region Constructor
public MainWindow()
{
    InitializeComponent();
    CompositionTarget.Rendering += CompositionTarget_Rendering;
}
#endregion Constructor

#region Methods
private void CompositionTarget_Rendering(object sender, EventArgs e)
{
    DiscoverKinectSensor();
    PollColorImageStream();
}
#endregion Methods
The member variable declarations in Listing 2-10 are the same as in previous projects in this chapter. Polling changes how an application retrieves data, but many of the other aspects of working with the data are the same as when using the event model. Any polling-based project needs to discover and initialize a KinectSensor object just as an event-based application does. This project also uses a WriteableBitmap to create frame images. The primary difference is that in the constructor we subscribe to the Rendering event on the CompositionTarget object. But, wait!

What is the CompositionTarget object?

What causes the Rendering event to fire?

Aren't we technically still using an event model?

These are all valid questions to ask. The CompositionTarget object is a representation of the drawable surface of an application. The Rendering event fires once per rendering cycle. To poll Kinect for new frame data, we need a loop. There are two ways to create this loop. One method is to use a thread, which we will do in our next project, but another, simpler technique is to use a built-in loop. The Rendering event of the CompositionTarget provides a loop with the least amount of work. Using the CompositionTarget is similar to the gaming loop of a gaming engine such as XNA. There is one drawback to using the CompositionTarget: any long-running process in the Rendering event handler can cause performance issues on the UI, because the event handler executes on the main UI thread. With this in mind, do not attempt to do too much work in this event handler.

The code within the Rendering event handler needs to do four tasks. It must discover a connected KinectSensor, initialize the sensor, respond to any status change in the sensor, and ultimately poll for and process a new frame of data. We break these four tasks into two methods. The code in Listing 2-11 performs the first three tasks. This code is almost identical to code we previously wrote in Listing 2-2.
Listing 2-11. Discover and Initialize a KinectSensor for Polling

private void DiscoverKinectSensor()
{
    if(this._Kinect != null && this._Kinect.Status != KinectStatus.Connected)
    {
        //If the sensor is no longer connected, we need to discover a new one.
        this._Kinect = null;
    }

    if(this._Kinect == null)
    {
        //Find the first connected sensor
        this._Kinect = KinectSensor.KinectSensors
                                   .FirstOrDefault(x => x.Status == KinectStatus.Connected);

        if(this._Kinect != null)
        {
            //Initialize the found sensor
            this._Kinect.ColorStream.Enable();
            this._Kinect.Start();

            ColorImageStream colorStream  = this._Kinect.ColorStream;
            this._ColorImageBitmap        = new WriteableBitmap(colorStream.FrameWidth,
                                                                colorStream.FrameHeight,
                                                                96, 96, PixelFormats.Bgr32,
                                                                null);
            this._ColorImageBitmapRect    = new Int32Rect(0, 0, colorStream.FrameWidth,
                                                          colorStream.FrameHeight);
            this._ColorImageStride        = colorStream.FrameWidth *
                                            colorStream.FrameBytesPerPixel;
            this.ColorImageElement.Source = this._ColorImageBitmap;
            this._ColorImagePixelData     = new byte[colorStream.FramePixelDataLength];
        }
    }
}
Listing 2-12 provides the code for the PollColorImageStream method. Refer back to Listing 2-6 and notice that the code is virtually the same. In Listing 2-12, we test to ensure we have a sensor with which to work, and we call the OpenNextFrame method to get a color image frame. The code to get the frame and update the WriteableBitmap is wrapped in a try-catch statement, because of the possibility of the OpenNextFrame method call throwing an exception. The 100-millisecond timeout duration passed to the OpenNextFrame call is fairly arbitrary. A well-chosen timeout ensures the application continues to operate smoothly even if a frame or two is skipped. You will also want your application to maintain as close to 30 frames per second as possible.
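To put the 100 milliseconds in context: at 30 frames per second a new frame arrives roughly every 33 milliseconds, so the timeout spans about three frame periods before OpenNextFrame gives up and returns null. A quick check of that arithmetic (the helper name is our own):

```csharp
using System;

// Frame period at a given frame rate, in whole milliseconds.
int FramePeriodMs(int framesPerSecond) => 1000 / framesPerSecond;

int period = FramePeriodMs(30);
Console.WriteLine(period);       // 33
Console.WriteLine(100 / period); // 3 frame periods fit inside the 100ms timeout
```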
Listing 2-12. Polling for a Frame of Stream Data

private void PollColorImageStream()
{
    if(this._Kinect == null)
    {
        //Display a message to plug-in a Kinect.
    }
    else
    {
        try
        {
            using(ColorImageFrame frame = this._Kinect.ColorStream.OpenNextFrame(100))
            {
                if(frame != null)
                {
                    frame.CopyPixelDataTo(this._ColorImagePixelData);
                    this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect,
                                                       this._ColorImagePixelData,
                                                       this._ColorImageStride, 0);
                }
            }
        }
        catch(Exception ex)
        {
            //Report an error message
        }
    }
}
Now run the project. Overall, the polling model should perform better than the event model, although, due to the simplicity of these examples, any change in performance may be marginal at best. This example of using polling, however, continues to suffer from another problem. By using the CompositionTarget object, the application remains tied to WPF's UI thread. Any long-running data processing or a poorly chosen timeout for the OpenNextFrame method can cause slow, choppy, or unresponsive behavior in the application, because it executes on the UI thread. The solution is to fork a new thread and implement all polling and data processing on the secondary thread.

To contrast polling using the CompositionTarget with polling on a background thread, create a new WPF project in Visual Studio. In .NET, working with threads is made easy with the BackgroundWorker class. Using a BackgroundWorker object, developers do not have to concern themselves with the tedious work of managing the thread. Starting and stopping the thread becomes trivial. Developers are only responsible for writing the code that executes on the thread. To use the BackgroundWorker class, add System.ComponentModel to the set of using statements in MainWindow.xaml.cs. When finished, add the code from Listing 2-13.
Listing 2-13. Polling on a Separate Thread

#region Member Variables
private KinectSensor _Kinect;
private WriteableBitmap _ColorImageBitmap;
private Int32Rect _ColorImageBitmapRect;
private int _ColorImageStride;
private byte[] _ColorImagePixelData;
private BackgroundWorker _Worker;
#endregion Member Variables

public MainWindow()
{
    InitializeComponent();

    this._Worker = new BackgroundWorker();
    this._Worker.WorkerSupportsCancellation = true; //Required; CancelAsync throws without it
    this._Worker.DoWork += Worker_DoWork;
    this._Worker.RunWorkerAsync();

    this.Unloaded += (s, e) => { this._Worker.CancelAsync(); };
}

private void Worker_DoWork(object sender, DoWorkEventArgs e)
{
    BackgroundWorker worker = sender as BackgroundWorker;

    if(worker != null)
    {
        while(!worker.CancellationPending)
        {
            DiscoverKinectSensor();
            PollColorImageStream();
        }
    }
}
First, notice that the set of member variables in this project is the same as in the previous project (Listing 2-10), with the addition of the BackgroundWorker variable _Worker. In the constructor, we create an instance of the BackgroundWorker class, subscribe to the DoWork event, and start the new thread. The last line of code in the constructor creates an anonymous event handler to stop the BackgroundWorker when the window closes. Also included in Listing 2-13 is the code for the DoWork event handler.

The DoWork event fires when the thread starts. The event handler loops until it is notified that the thread has been cancelled. During each pass through the loop, it calls the DiscoverKinectSensor and PollColorImageStream methods. Copy the code for these methods from the previous project (Listing 2-11 and Listing 2-12), but do not attempt to run the application. If you do, you will quickly notice that the application fails with an InvalidOperationException. The error message reads, "The calling thread cannot access this object because a different thread owns it."
The polling of data occurs on a background thread, but we want to update UI elements, which exist on another thread. This is the joy of working across threads. To update the UI elements from the background thread, we use the Dispatcher object. Each UI element in WPF has a Dispatcher, which is used to execute units of work on the same thread as the UI element. Listing 2-14 contains the updated versions of the DiscoverKinectSensor and PollColorImageStream methods, with the changes necessary to update the UI thread.
Listing 2-14. Updating the UI Thread

private void DiscoverKinectSensor()
{
    if(this._Kinect != null && this._Kinect.Status != KinectStatus.Connected)
    {
        this._Kinect = null;
    }

    if(this._Kinect == null)
    {
        this._Kinect = KinectSensor.KinectSensors
                                   .FirstOrDefault(x => x.Status == KinectStatus.Connected);

        if(this._Kinect != null)
        {
            this._Kinect.ColorStream.Enable();
            this._Kinect.Start();

            ColorImageStream colorStream = this._Kinect.ColorStream;

            this.ColorImageElement.Dispatcher.BeginInvoke(new Action(() =>
            {
                this._ColorImageBitmap     = new WriteableBitmap(colorStream.FrameWidth,
                                                                 colorStream.FrameHeight,
                                                                 96, 96, PixelFormats.Bgr32,
                                                                 null);
                this._ColorImageBitmapRect = new Int32Rect(0, 0, colorStream.FrameWidth,
                                                           colorStream.FrameHeight);
                this._ColorImageStride     = colorStream.FrameWidth *
                                             colorStream.FrameBytesPerPixel;
                this._ColorImagePixelData  = new byte[colorStream.FramePixelDataLength];
                this.ColorImageElement.Source = this._ColorImageBitmap;
            }));
        }
    }
}

private void PollColorImageStream()
{
    if(this._Kinect == null)
    {
        //Notify that there are no available sensors.
    }
    else
    {
        try
        {
            using(ColorImageFrame frame = this._Kinect.ColorStream.OpenNextFrame(100))
            {
                if(frame != null)
                {
                    frame.CopyPixelDataTo(this._ColorImagePixelData);

                    this.ColorImageElement.Dispatcher.BeginInvoke(new Action(() =>
                    {
                        this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect,
                                                           this._ColorImagePixelData,
                                                           this._ColorImageStride, 0);
                    }));
                }
            }
        }
        catch(Exception ex)
        {
            //Report an error message
        }
    }
}
With these code additions, you now have two functional examples of polling. Neither of these polling examples is a complete, robust application. In fact, both require some work to properly manage and clean up resources. For example, neither uninitializes the KinectSensor. Such tasks remain your responsibility when building real-world Kinect-based applications.

These examples nevertheless provide the foundation from which you can build any application driven by a polling architecture. Polling has distinct advantages over the event model, but at the cost of additional work for the developer and potential complexity in the application code. In most applications, the event model is adequate and should be used instead of polling; the primary exception is when your application is not written for WPF. For example, any console, XNA, or other application using a custom application loop should employ the polling architecture model. It is recommended that all WPF-based applications initially use the frame-ready events on the KinectSensor and only transition to polling if performance concerns warrant it.
Summary
In software development, patterns persist everywhere. Every application of a certain type or category is fundamentally the same and contains similar structures and architectures. In this chapter, we presented core code for building Kinect-driven applications. Each Kinect application you develop will contain the same lines of code to discover, initialize, and uninitialize the Kinect sensor.

We explored the process of working with frame data generated by the ColorImageStream. Additionally, we dove deeper and studied the ColorImageStream's base ImageStream class as well as the
ImageFrame class. The ImageStream and ImageFrame are also the base classes for the DepthImageStream and DepthImageFrame classes, which we introduce in the next chapter.

The mechanisms to retrieve raw data from Kinect's data streams are the same regardless of which stream you use. Architecturally speaking, all Kinect applications use either events or polling to retrieve frame data from a Kinect data stream. The easiest to use is the event model. This is also the de facto architecture. When using the event model, the Kinect SDK does the work of polling stream data for you. However, if the needs of the application dictate, the Kinect SDK allows developers to poll data from the Kinect manually.

These are the fundamentals. Are you ready to see the Kinect SDK in greater depth?
CHAPTER 3
Depth Image Processing
The production of three-dimensional data is the primary function of Kinect. It is up to you to create exciting experiences with the data. A precondition to building a Kinect application is having an understanding of the output of the hardware. Beyond simply understanding the intrinsic meaning of the 1s and 0s is a comprehension of their existential significance. Image-processing techniques exist today that detect the shapes and contours of objects within an image. The Kinect SDK uses image processing to track user movements in the skeleton tracking engine. Depth image processing can also detect non-human objects such as a chair or coffee cup. There are numerous commercial labs and universities actively studying techniques to perform this level of object detection from depth images.

There are so many different uses and fields of study around depth input that it would be impossible to cover them all, or cover any one topic with considerable profundity, in this book, much less in a single chapter. The goal of this chapter is to detail the depth data down to the meaning of each bit, and to introduce you to the possible impact that adding just one additional dimension can have on an application. In this chapter, we discuss some basic concepts of depth image processing, and simple techniques for using this data in your applications.
Seeing Through the Eyes of the Kinect
Kinect is different from all other input devices, because it provides a third dimension. It does this using an infrared emitter and camera. Unlike other Kinect SDKs such as OpenNI or libfreenect, the Microsoft SDK does not provide raw access to the IR stream. Instead, the Kinect SDK processes the IR data returned by the infrared camera to produce a depth image. Depth image data comes from a DepthImageFrame, which is produced by the DepthImageStream.

Working with the DepthImageStream is similar to the ColorImageStream. The DepthImageStream and ColorImageStream both share the same parent class, ImageStream. We create images from a frame of depth data just as we did with the color stream data. Begin to see the depth stream images by following these steps, which by now should look familiar. They are the same as in the previous chapter, where we worked with the color stream.
1. Create a new WPF Application project.

2. Add a reference to Microsoft.Kinect.dll.

3. Add an Image element to MainWindow.xaml and name it "DepthImage".

4. Add the necessary code to detect and initialize a KinectSensor object. Refer to Chapter 2 as needed.

5. Update the code that initializes the KinectSensor object so that it matches Listing 3-1.
Listing 3-1. Initializing the Depth Stream
this._KinectDevice.DepthStream.Enable();
this._KinectDevice.DepthFrameReady += KinectDevice_DepthFrameReady;
6. Add the DepthFrameReady event handler code, as shown in Listing 3-2. For the sake of being brief with the code listing, we are not using the WriteableBitmap to create depth images. We leave this as a refactoring exercise for you to undertake. Refer to Listing 2-5 of Chapter 2 as needed.
Listing 3-2. DepthFrameReady Event Handler

using(DepthImageFrame frame = e.OpenDepthImageFrame())
{
    if(frame != null)
    {
        short[] pixelData = new short[frame.PixelDataLength];
        frame.CopyPixelDataTo(pixelData);

        int stride = frame.Width * frame.BytesPerPixel;
        DepthImage.Source = BitmapSource.Create(frame.Width, frame.Height, 96, 96,
                                                PixelFormats.Gray16, null,
                                                pixelData, stride);
    }
}
7. Run the application!
When Kinect has a new depth image frame available for processing, the KinectSensor fires the DepthFrameReady event. Our event handler simply takes the image data and creates a bitmap, which is then displayed in the UI window. The screenshot in Figure 3-1 is an example of the depth stream image. Objects near Kinect are a dark shade of gray or black. The farther an object is from Kinect, the lighter the gray.
Figure 3-1. Raw depth image frame
Measuring Depth
The IR or depth camera has a field of view just like any other camera. The field of view of Kinect is limited, as illustrated in Figure 3-2. The original purpose of Kinect was to play video games within the confines of game room or living room space. Kinect's normal depth vision ranges from around two and a half feet (800mm) to just over 13 feet (4000mm). However, a recommended usage range is 3 feet to 12 feet, as the reliability of the depth values degrades at the edges of the field of view.
Figure 3-2. Kinect field of view
Like any camera, the field of view of the depth camera is pyramid shaped. Objects farther away from the camera have a greater lateral range than objects nearer to Kinect. This means that height and width pixel dimensions, such as 640x480, do not correspond with a physical location in the camera's field of view. The depth value of each pixel, however, does map to a physical distance in the field of view. Each pixel represented in a depth frame is 16 bits, making the BytesPerPixel property of each frame a value of two. The depth value of each pixel occupies only 13 of the 16 bits, as shown in Figure 3-3.
Figure 3-3. Layout of the depth bits
Getting the distance of each pixel is easy, but not obvious. It requires some bit manipulation, which sadly, as developers, we do not get to do much of these days. It is quite possible that some developers have never used or even heard of bitwise operators. This is unfortunate, because bit manipulation is fun and can be an art. At this point, you are likely thinking something to the effect of, "this guy's a nerd," and
you'd be right. The appendix includes more instruction and examples of using bit manipulation and math.
As Figure 3-3 shows, the depth value is stored in bits 3 to 15. To get the depth value into a workable form, we have to shift the bits to the right to remove the player index bits. We discuss the significance of the player index bits later. Listing 3-3 shows sample code to get the depth value of a pixel. Refer to Appendix A for a thorough explanation of the bitwise operations used in Listing 3-3. In the listing, the pixelData variable is assumed to be an array of short values originating from the depth frame. The pixelIndex variable is calculated based on the position of the desired pixel. The Kinect for Windows SDK defines a constant on the DepthImageFrame class named PlayerIndexBitmaskWidth, which specifies the number of bits to shift right to get the depth value. Applications should use this constant instead of hard-coded literals, as the number of bits reserved for players may increase in future releases of the Kinect hardware and the SDK.
Listing 3-3. Bit Manipulation to Get Depth
int pixelIndex = pixelX + (pixelY * frame.Width);
int depth = pixelData[pixelIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;
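To see the layout from Figure 3-3 in action without a sensor attached, you can pack and unpack a pixel value by hand. The sketch below hard-codes the current bitmask width (3) and mask (7) purely so it stands alone; real code should use the DepthImageFrame constants, as noted above:

```csharp
using System;

// Illustrative stand-ins for DepthImageFrame.PlayerIndexBitmaskWidth (3)
// and DepthImageFrame.PlayerIndexBitmask (7); hard-coded so the sketch runs alone.
const int playerIndexBitmaskWidth = 3;
const int playerIndexBitmask      = 7;

// Pack a 3000mm depth with player index 2, mimicking a raw depth pixel.
short pixel = (short) ((3000 << playerIndexBitmaskWidth) | 2);

int depth       = pixel >> playerIndexBitmaskWidth; // shift off the player index bits
int playerIndex = pixel & playerIndexBitmask;       // keep only the low three bits

Console.WriteLine(depth);       // 3000
Console.WriteLine(playerIndex); // 2
```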
An easy way to see the depth data is to display the actual numbers. Let us update our code to output the depth value of a pixel at a particular location. This demonstration uses the position of the mouse pointer when the mouse is clicked on the depth image. The first step is to create a place to display the depth value. Update MainWindow.xaml to look like Listing 3-4.
Listing 3-4. New TextBlock to Display Depth Values

<Window x:Class="BeginningKinect.Chapter3.DepthImage.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="600" Width="800">
    <Grid>
        <StackPanel>
            <TextBlock x:Name="PixelDepth" FontSize="48" HorizontalAlignment="Left"/>
            <Image x:Name="DepthImage" Width="640" Height="480"/>
        </StackPanel>
    </Grid>
</Window>
Listing 3-5 shows the code for the mouse-up event handler. Before adding this code, there are a couple of changes to note. The code in Listing 3-5 assumes the project has been refactored to use a WriteableBitmap. The code changes specific to this demonstration start by creating a private member variable named _LastDepthFrame. In the KinectDevice_DepthFrameReady event handler, set the value of the _LastDepthFrame member variable to the current frame each time the DepthFrameReady event fires. Because we need to keep a reference to the last depth frame, the event handler code does not immediately dispose of the frame object. Next, subscribe to the MouseLeftButtonUp event on the DepthImage element. When the user clicks the depth image, the DepthImage_MouseLeftButtonUp event handler executes, which locates the correct pixel by the mouse coordinates. The last step is to display the value in the TextBlock named PixelDepth created in Listing 3-4.
Listing 3-5. Response to a Mouse Click

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    if(this._LastDepthFrame != null)
    {
        this._LastDepthFrame.Dispose();
        this._LastDepthFrame = null;
    }

    this._LastDepthFrame = e.OpenDepthImageFrame();

    if(this._LastDepthFrame != null)
    {
        this._LastDepthFrame.CopyPixelDataTo(this._DepthImagePixelData);
        this._RawDepthImage.WritePixels(this._RawDepthImageRect, this._DepthImagePixelData,
                                        this._RawDepthImageStride, 0);
    }
}

private void DepthImage_MouseLeftButtonUp(object sender, MouseButtonEventArgs e)
{
    Point p = e.GetPosition(DepthImage);

    if(this._DepthImagePixelData != null && this._DepthImagePixelData.Length > 0)
    {
        int pixelIndex  = (int) (p.X + ((int) p.Y * this._LastDepthFrame.Width));
        int depth       = this._DepthImagePixelData[pixelIndex] >>
                          DepthImageFrame.PlayerIndexBitmaskWidth;
        int depthInches = (int) (depth * 0.0393700787);
        int depthFt     = depthInches / 12;
        depthInches     = depthInches % 12;

        PixelDepth.Text = string.Format("{0}mm ~ {1}'{2}\"", depth, depthFt, depthInches);
    }
}
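The feet-and-inches conversion in Listing 3-5 is worth checking in isolation; 0.0393700787 is the millimeters-to-inches factor. FeetAndInches is our own helper name, not part of the project code:

```csharp
using System;

// Converts a depth in millimeters to a feet-and-inches string,
// using the same integer truncation as Listing 3-5.
string FeetAndInches(int depthMm)
{
    int depthInches = (int) (depthMm * 0.0393700787);
    int depthFt     = depthInches / 12;
    depthInches     = depthInches % 12;

    return string.Format("{0}'{1}\"", depthFt, depthInches);
}

Console.WriteLine(FeetAndInches(2000)); // 6'6" - 2000mm is roughly 78.7 inches
```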
It is important to point out a few particulars of this code. Notice that the Width and Height properties of the Image element are hard-coded (Listing 3-4). If these values are not hard-coded, then the Image element naturally scales with the size of its parent container. If the Image element's dimensions were sized differently from the depth frame dimensions, this code would return incorrect data or, more likely, throw an exception when the image is clicked. The pixel array in the frame is a fixed size based on the DepthImageFormat value given to the Enable method of the DepthImageStream. Not setting the image size means that it will scale with the size of its parent container, which, in this case, is the application window. If you let the image scale automatically, you then have to perform extra calculations to translate the mouse position to the depth frame dimensions. This type of scaling exercise is actually quite common, as we will see later in this chapter and the chapters that follow, but here we keep it simple and hard-code the output image size.
We calculate the pixel location within the byte array using the position of the mouse within the image and the size of the image. With the pixel's starting byte located, convert the depth value using the logic from Listing 3-3. For completeness, we display the depth in feet and inches in addition to millimeters. All of the local variables exist only to make the code more readable on these pages and do not materially affect the execution of the code.
Figure 3-4 shows the output produced by the code. The depth frame image displays on the screen and provides a point of reference for the user to target. In this screenshot, the mouse is positioned over the palm of the user's hand. On mouse click, the position of the mouse cursor is used to find the depth value of the pixel at that position. With the pixel located, it is easy to extract the depth value.
Figure 3-4. Displaying the depth value for a pixel
Note: A depth value of zero means that Kinect was unable to determine the depth of the pixel. When processing depth data, treat zero depth values as a special case; in most instances, you will disregard them. Expect a zero depth value for any pixel where there is an object too close to the Kinect.
Enhanced Depth Images

Before going any further, we need to address the look of the depth image. It is naturally difficult to see. The shades of gray fall on the darker end of the spectrum. In fact, the images in Figures 3-1 and 3-4 had to be altered with an image-editing tool to be printable in the book! In the next set of exercises, we manipulate the image bits just as we did in the previous chapter. However, there will be a few differences because, as we know, the data for each pixel is different. Following that, we examine how we can colorize the depth images to provide even greater depth resolution.
Better Shades of Gray

The easiest way to improve the appearance of the depth image is to invert the bits. The color of each pixel is based on the depth value, which starts from zero. In the digital color spectrum, black is 0 and white is 65535 (16-bit grayscale). This means that most depths fall into the darker end of the spectrum. Additionally, do not forget that all undeterminable depths are set to zero. Inverting or complementing the bits shifts the bias towards the lighter end of the spectrum. A depth of zero is now white.
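The effect of the complement is easy to verify with a couple of literals; this is plain C#, independent of the Kinect SDK:

```csharp
using System;

// The bitwise complement flips every bit, so small (dark) 16-bit values
// become large (light) ones.
ushort depthZero = 0;      // an undeterminable depth
ushort depthNear = 0x0FFF; // a small raw value, very dark in Gray16

Console.WriteLine((ushort) ~depthZero); // 65535 - zero depth now renders white
Console.WriteLine((ushort) ~depthNear); // 61440 - still near the light end
```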
We keep the original depth image in the UI for comparison with the enhanced depth image. Update MainWindow.xaml to include a new StackPanel and Image element, as shown in Listing 3-6. Notice the adjustment to the window's size to ensure that both images are visible without having to resize the window.
Listing 3-6. Updated UI for New Depth Image

<Window x:Class="BeginningKinect.Chapter3.DepthImage.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="600" Width="1280">
    <Grid>
        <StackPanel>
            <TextBlock x:Name="PixelDepth" FontSize="48" HorizontalAlignment="Left"/>
            <StackPanel Orientation="Horizontal">
                <Image x:Name="DepthImage" Width="640" Height="480"/>
                <Image x:Name="EnhancedDepthImage" Width="640" Height="480"/>
            </StackPanel>
        </StackPanel>
    </Grid>
</Window>
Listing 3-7 shows the code to flip the depth bits to create a better depth image. Add this method to your project code, and call it from the KinectDevice_DepthFrameReady event handler. The simple function of this code is to create a new array and do a bitwise complement of the bits. Also notice that this method filters out some pixels by distance. Because we know depth data becomes inaccurate at the edges of the depth range, we set the pixels outside of our threshold range to a fixed value. In this example, any pixel closer than 4 feet or farther than 10 feet is set to 0xFF, which in 16-bit grayscale is nearly black.
Listing 3-7. A Light Shade of Gray Depth Image

private void CreateLighterShadesOfGray(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    int loThreshold      = 1220;
    int hiThreshold      = 3048;
    short[] enhPixelData = new short[depthFrame.Width * depthFrame.Height];

    for(int i = 0; i < pixelData.Length; i++)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth < loThreshold || depth > hiThreshold)
        {
            enhPixelData[i] = 0xFF;
        }
        else
        {
            enhPixelData[i] = (short) ~pixelData[i];
        }
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height,
                                                    96, 96, PixelFormats.Gray16, null,
                                                    enhPixelData,
                                                    depthFrame.Width *
                                                    depthFrame.BytesPerPixel);
}
Note that a separate method is doing the image manipulation, whereas up to now all frame processing has been performed in the event handlers. Event handlers should contain as little code as possible and should delegate the work to other methods. There may be instances, mostly driven by performance considerations, where the processing work will have to be done in a separate thread. Having the code broken out into methods like this makes these types of changes easy and painless.

Figure 3-5 shows the application output. The two depth images are shown side by side for contrast. The image on the left is the natural depth image output, while the image on the right is produced by the code in Listing 3-7. Notice the distinct inversion of grays.
Figure 3-5. Lighter shades of gray
While this image is better, the range of grays is limited. We created a lighter shade of gray and not a better shade of gray. To create a richer set of grays, we expand the image from being a 16-bit grayscale to 32 bits and color. The color gray occurs when the red, blue, and green values are the same. This gives us a range from 0 to 255. Zero is black, 255 is white, and everything in between is a shade of gray. To make it easier to switch between the two processed depth images, we create a new version of the method, as shown in Listing 3-8.
Listing 3-8. The Depth Image in a Better Shade of Gray

private void CreateBetterShadesOfGray(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    int gray;
    int loThreshold     = 1220;
    int hiThreshold     = 3048;
    int bytesPerPixel   = 4;
    byte[] enhPixelData = new byte[depthFrame.Width * depthFrame.Height * bytesPerPixel];

    for(int i = 0, j = 0; i < pixelData.Length; i++, j += bytesPerPixel)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth < loThreshold || depth > hiThreshold)
        {
            gray = 0xFF;
        }
        else
        {
            gray = (255 * depth / 0xFFF);
        }

        enhPixelData[j]     = (byte) gray;
        enhPixelData[j + 1] = (byte) gray;
        enhPixelData[j + 2] = (byte) gray;
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height,
                                                    96, 96, PixelFormats.Bgr32, null,
                                                    enhPixelData,
                                                    depthFrame.Width * bytesPerPixel);
}
The differences between this processing of the depth image and the previous attempt (Listing 3-7) lie in the threshold handling and the grayscale calculation. The color image format changes to Bgr32, which means there are a total of 32 bits (4 bytes) per pixel. Each color gets 8 bits, and there are 8 unused bits. This limits the number of possible grays to 256. Any value outside of the threshold range is set to the color white. All other depths are represented in shades of gray. The intensity of the gray is the result of dividing the depth by 4095 (0xFFF), which is the largest possible depth value, and then multiplying by 255.

Figure 3-6 shows the three different depth images demonstrated so far in the chapter.
Figure 3-6. Different visualizations of the depth image. From left to right: raw depth image, depth image from Listing 3-7, and depth image from Listing 3-8
Color Depth

The enhanced depth image produces a shade of gray for each depth value. The range of grays is only 0 to 255, which is much smaller than our range of depth values. Using colors to represent each depth value gives more depth to the depth image. While there are certainly more advanced techniques for doing this, a simple method is to convert the depth values into hue and saturation values. Listing 3-9 shows an example of one way to colorize a depth image.
Listing 3-9. Coloring the Depth Image

private void CreateColorDepthImage(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    double hue;
    int loThreshold     = 1220;
    int hiThreshold     = 3048;
    int bytesPerPixel   = 4;
    byte[] rgb          = new byte[3];
    byte[] enhPixelData = new byte[depthFrame.Width * depthFrame.Height * bytesPerPixel];

    for(int i = 0, j = 0; i < pixelData.Length; i++, j += bytesPerPixel)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth < loThreshold || depth > hiThreshold)
        {
            enhPixelData[j]     = 0x00;
            enhPixelData[j + 1] = 0x00;
            enhPixelData[j + 2] = 0x00;
        }
        else
        {
            hue = ((360 * depth / 0xFFF) + loThreshold);
            ConvertHslToRgb(hue, 100, 100, rgb);

            enhPixelData[j]     = rgb[2];   //Blue
            enhPixelData[j + 1] = rgb[1];   //Green
            enhPixelData[j + 2] = rgb[0];   //Red
        }
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height,
                                                    96, 96, PixelFormats.Bgr32, null,
                                                    enhPixelData,
                                                    depthFrame.Width * bytesPerPixel);
}
Hue values are measured in degrees of a circle and range from 0 to 360. The hue value is proportional to the depth offset integer and the depth threshold. The ConvertHslToRgb method uses a common algorithm to convert HSL values to RGB values, and is included in the downloadable code for this book. This example sets the saturation and lightness values to 100%.

The running application generates a depth image like the last image in Figure 3-7. The first image in the figure is the raw depth image, and the middle image is generated from Listing 3-8. Depths closer to the camera are shades of blue. The shades of blue transition to purple, and then to red the farther the object is from Kinect. The values continue along this scale.
Figure 3-7. Color depth image compared to grayscale
You will notice that the performance of the application is suddenly markedly sluggish. It takes a copious amount of work to convert each pixel (640 × 480 = 307,200 pixels!) into a color value using this method. We do not recommend you do this work on the UI thread as we have in this example. A better approach is to do this work on a background thread. Each time the KinectSensor fires the frame-ready event, your code stores the frame in a queue. A background thread continuously converts the next frame in the queue to a color image. After the conversion, the background thread uses WPF's Dispatcher to update the Image source on the UI thread. This type of application architecture is very common in Kinect-based applications, because the work necessary to process the depth data is performance intensive. It is bad application design to do this type of work on the UI thread, as it will lower the frame rate and ultimately create a bad user experience.
Simple Depth Image Processing

To this point, we have extracted the depth value of each pixel and created images from the data. In previous examples, we filtered out pixels that were beyond certain threshold values. This is a form of image processing, not surprisingly called thresholding. Our use of thresholding, while crude, suits our needs. More advanced processes use machine learning to calculate threshold values for each frame.
Note: Kinect returns 4096 (0 to 4095) possible depth values. Since a zero value always means the depth is undeterminable, it can always be filtered out. Microsoft recommends using only depths from 4 to 12.5 feet. Before doing any other depth processing, you can build thresholds into your application and only process depths ranging from 1220 (4') to 3810 (12.5').
Using statistics is common when processing depth image data. Thresholds can be calculated based on the mean or median of depth values. Probabilities help determine if a pixel is noise, a shadow, or something of greater meaning, such as being part of a user's hand. If you allow your mind to forget the visual meaning of a pixel, it transitions into raw data, at which point data mining techniques become applicable. The motivation behind processing depth pixels is to perform shape and object recognition. With this information, applications can determine where a user is in relation to Kinect, where that user's hand is, and if that hand is in the act of waving.
Histograms

The histogram is a tool for determining statistical distributions of data. Our concern is the distribution of depth data. Histograms visually tell the story of how recurrent certain data values are for a given data set. From a histogram we discern how frequently and how tightly grouped depth values are. With this information, it is possible to make decisions that determine thresholds and other filtering techniques, which ultimately reveal the contents of the depth image. To demonstrate this, we next build and display a histogram from a depth frame, and then use simple techniques to filter unwanted pixels.

Let's start fresh and create a new project. Perform the standard steps of discovering and initializing a KinectSensor object for depth-only processing, including subscribing to the DepthFrameReady event. Before adding the code to build the depth histogram, update the MainWindow.xaml with the code shown in Listing 3-10.
Listing 3-10. Depth Histogram UI

<Window x:Class="BeginningKinect.Chapter3.DepthHistograms.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
Title="MainWindow" Height="800" Width="1200">
<Grid>
<StackPanel>
<StackPanel Orientation="Horizontal">
<Image x:Name="DepthImage" Width="640" Height="480"/>
<Image x:Name="FilteredDepthImage" Width="640" Height="480"/>
</StackPanel>
<ScrollViewer Margin="0,15" HorizontalScrollBarVisibility="Auto"
VerticalScrollBarVisibility="Auto">
<StackPanel x:Name="DepthHistogram" Orientation="Horizontal" Height="300"/>
</ScrollViewer>
</StackPanel>
</Grid>
</Window>
Our approach to creating the histogram is simple. We create a series of Rectangle elements and add them to the DepthHistogram (a StackPanel element). While the graph will not have high fidelity for this demonstration, it serves us well. Most applications calculate histogram data and use it for internal processing only. However, if our intent were to include the histogram data within the UI, we would certainly put more effort into the look and feel of the graph. The code to build and display the histogram is shown in Listing 3-11.
Listing 3-11. Building a Depth Histogram

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using(DepthImageFrame frame = e.OpenDepthImageFrame())
    {
        if(frame != null)
        {
            frame.CopyPixelDataTo(this._DepthPixelData);
            CreateBetterShadesOfGray(frame, this._DepthPixelData);  //See Listing 3-8
            CreateDepthHistogram(frame, this._DepthPixelData);
        }
    }
}

private void CreateDepthHistogram(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    int[] depths         = new int[4096];
    int maxValue         = 0;
    double chartBarWidth = DepthHistogram.ActualWidth / depths.Length;

    DepthHistogram.Children.Clear();

    //First pass - Count the depths.
    for(int i = 0; i < pixelData.Length; i++)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth != 0)
        {
            depths[depth]++;
        }
    }

    //Second pass - Find the max depth count to scale the histogram to the
    //space available. This is only to make the UI look nice.
    for(int i = 0; i < depths.Length; i++)
    {
        maxValue = Math.Max(maxValue, depths[i]);
    }

    //Third pass - Build the histogram.
    for(int i = 0; i < depths.Length; i++)
    {
        if(depths[i] > 0)
        {
            Rectangle r         = new Rectangle();
            r.Fill              = Brushes.Black;
            r.Width             = chartBarWidth;
            r.Height            = DepthHistogram.ActualHeight *
                                  (depths[i] / (double) maxValue);
            r.Margin            = new Thickness(1, 0, 1, 0);
            r.VerticalAlignment = System.Windows.VerticalAlignment.Bottom;
            DepthHistogram.Children.Add(r);
        }
    }
}
Building the histogram starts by creating an array to hold a count for each depth. The array size is 4096, which is the number of possible depth values. The first step is to iterate through the depth image pixels, extract the depth value, and increment the count in the depths array for each depth value encountered. Depth values of zero are ignored, because they represent out-of-range depths. Figure 3-8 shows a depth image with a histogram of the depth values. The depth values are along the X-axis. The Y-axis represents the frequency of each depth value in the image.
Figure 3-8. Depth image with histogram
As you interact with the application, it is interesting (and cool) to see how the graph flows and changes as you move closer to and farther away from Kinect. Grab a friend and see the results as multiple users are in view. Another test is to add different objects, the larger the better, and place them in the view area to see how this affects the histogram. Take notice of the two spikes at the end of the graph in Figure 3-8. These spikes represent the wall in this picture. The wall is about seven feet from Kinect, whereas the user is roughly five feet away. This is an example of when to employ thresholding. In this instance, it is undesirable to include the wall. The images shown in Figure 3-9 are the result of hardcoding a threshold range of 3 to 6.5 feet. Notice how the distribution of depth changes.
Figure 3-9. Depth images and histogram with the wall filtered out of the image; the second image (right) shows the user holding a newspaper approximately two feet in front of the user
While watching the undulations in the graph change in real time is interesting, you quickly begin wondering what the next steps are. What else can we do with this data, and how can it be useful in an application? Analysis of the histogram can reveal peaks and valleys in the data. By applying image processing techniques, such as thresholding to filter out data, the histogram data can reveal more about the image. Further application of other data processing techniques can reduce noise or normalize the data, lessening the differences between the peaks and valleys. As a result of the processing, it then becomes possible to detect edges of shapes and blobs of pixels. The blobs begin to take on recognizable shapes, such as people, chairs, or walls.
Further Reading

A study of image-processing techniques falls far beyond the scope of this chapter and book. The purpose here is to show that raw depth data is available to you, and to help you understand possible uses of the data. More than likely, your Kinect application will not need to process depth data extensively. For applications that require depth data processing, it quickly becomes necessary to use tools like the OpenCV library. Depth image processing is often resource intensive and needs to be executed at a lower level than is achievable with a language like C#.
Note: The OpenCV (Open Source Computer Vision, opencv.willowgarage.com) library is a collection of commonly used algorithms for processing and manipulating images. This group is also involved in the Point Cloud Library (PCL) and Robot Operating System (ROS), both of which involve intensive processing of depth data. Anyone looking beyond beginner's material should research OpenCV.
The more common reason an application would process raw depth data is to determine the positions of users in Kinect's view area. While the Microsoft Kinect SDK actually does much of this work for you through skeleton tracking, your application's needs may go beyond what the SDK provides. In the next section, we walk through the process of easily detecting the pixels that belong to users. Before moving on, you are encouraged to research and study image-processing techniques. Below are several topics to help further your research:

• Image Processing (general)
    • Thresholding
    • Segmentation
• Edge/Contour Detection
    • Gaussian filters
    • Sobel, Prewitt, and Kirsch
    • Canny edge detector
    • Roberts' Cross operator
    • Hough Transforms
• Blob Detection
    • Laplacian of the Gaussian
    • Hessian operator
    • k-means clustering
Depth and Player Indexing

The SDK has a feature that analyzes depth image data and detects human or player shapes. It recognizes as many as six players at a time. The SDK assigns a number to each tracked player. The number or player index is stored in the first three bits of the depth pixel data (Figure 3-10). As discussed in an earlier section of this chapter, each pixel is 16 bits. Bits 0 to 2 hold the player index value, and bits 3 to 15 hold the depth value. A bitmask of 7 (00000111) gets the player index from the depth value. For a detailed explanation of bitmasks, refer to Appendix A. Fortunately, the Kinect SDK defines a pair of constants focused on the player index bits: DepthImageFrame.PlayerIndexBitmaskWidth and DepthImageFrame.PlayerIndexBitmask. The value of the former is 3 and the latter is 7. Your application should use these constants and not the literal values, as the values may change in future versions of the SDK.
Figure 3-10. Depth and player index bits
A pixel with a player index value of zero means no player is at that pixel; otherwise, players are numbered 1 to 6. However, enabling only the depth stream does not activate player tracking. Player tracking requires skeleton tracking. When initializing the KinectSensor object and the DepthImageStream, you must also enable the SkeletonStream. Only with the SkeletonStream enabled will player index values appear in the depth pixel bits. Your application does not need to subscribe to the SkeletonFrameReady event to get player index values.

Let's explore the player index bits. Create a new project that discovers and initializes a KinectSensor object. Enable both the DepthImageStream and SkeletonStream, and subscribe to the DepthFrameReady event on the KinectSensor object. In the MainWindow.xaml add two Image elements named RawDepthImage and EnhDepthImage. Add the member variables and code to support creating images using the WriteableBitmap. Finally, add the code in Listing 3-12. This example changes the value of all pixels associated with a player to black and all other pixels to white. Figure 3-11 shows the output of this code. For contrast, the figure shows the raw depth image on the left.
Listing 3-12. Displaying Users in Black and White

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using(DepthImageFrame frame = e.OpenDepthImageFrame())
    {
        if(frame != null)
        {
            frame.CopyPixelDataTo(this._RawDepthPixelData);
            this._RawDepthImage.WritePixels(this._RawDepthImageRect, this._RawDepthPixelData,
                                            this._RawDepthImageStride, 0);
            CreatePlayerDepthImage(frame, this._RawDepthPixelData);
        }
    }
}

private void CreatePlayerDepthImage(DepthImageFrame depthFrame, short[] pixelData)
{
    int playerIndex;
    int depthBytesPerPixel = 4;
    byte[] enhPixelData    = new byte[depthFrame.Height * this._EnhDepthImageStride];

    for(int i = 0, j = 0; i < pixelData.Length; i++, j += depthBytesPerPixel)
    {
        playerIndex = pixelData[i] & DepthImageFrame.PlayerIndexBitmask;

        if(playerIndex == 0)
        {
            enhPixelData[j]     = 0xFF;
            enhPixelData[j + 1] = 0xFF;
            enhPixelData[j + 2] = 0xFF;
        }
        else
        {
            enhPixelData[j]     = 0x00;
            enhPixelData[j + 1] = 0x00;
            enhPixelData[j + 2] = 0x00;
        }
    }

    this._EnhDepthImage.WritePixels(this._EnhDepthImageRect, enhPixelData,
                                    this._EnhDepthImageStride, 0);
}
Figure 3-11. Raw depth image (left) and processed depth image with player indexing (right)
There are several possibilities for enhancing this code with code we wrote earlier in this chapter. For example, you can apply a grayscale to the player pixels based on the depth and black out all other pixels. In such a project, you could build a histogram of the player's depth values and then determine the grayscale value of each depth in relation to the histogram. Another common exercise is to apply a solid color to each different player, where Player 1's pixels are red, Player 2's blue, Player 3's green, and so on. The Kinect Explorer sample application that comes with the SDK does this. You could, of course, also base the color intensity of each pixel on the depth value. Since the depth data is the differentiating element of Kinect, you should use the data wherever and as much as possible.

As a word of caution, do not code to specific player indexes, as they are volatile. The actual player index number is not always consistent and does not correlate with the actual number of visible users. For example, a single user might be in view of Kinect, but Kinect will return a player index of three for that user's pixels. To demonstrate this, update the code to display a list of player indexes for all visible users. You will notice that sometimes when there is only a single user, Kinect does not identify that user as player 1. To test this out, walk out of view, wait for about 5 seconds, and walk back in. Kinect will identify you as a new player. Grab several friends to further test this by keeping one person in view at all times and having the others walk in and out of view. Kinect continually tracks users, but once a user has left the view area, it forgets about them. This is just something to keep in mind as you develop your Kinect application.
Taking Measure

An interesting exercise is to measure the pixels of the user. As discussed in the Measuring Depth section of this chapter, the X and Y positions of the pixels do not correspond to actual width or height measurements; however, it is possible to calculate them. Every camera has a field of view. The focal length and size of the camera's sensor determine the angles of the field. Microsoft's Kinect SDK Programming Guide tells us that the view angles are 57 degrees horizontal and 43 degrees vertical. Since we know the depth values, we can determine the width and height of a player using trigonometry, as illustrated in Figure 3-12, where we calculate a player's width.
Figure 3-12. Finding the player's real-world width
The process described below is not perfect and in certain circumstances can result in inaccurate and distorted values; however, so too is the data returned by Kinect. The inaccuracy is due to the simplicity of the calculations, which do not take into account other physical attributes of the player and space. Despite this, the values are accurate enough for most uses. The motivation here is to provide an introductory example of how Kinect data maps to the real world. You are encouraged to research the physics behind camera optics and field of view so that you can update this code to make the output more accurate.
Let us walk through the math before diving into the code. As Figure 3-12 shows, the angle of view of the camera forms an isosceles triangle, with the player's depth position forming the base. The actual depth value is the height of the triangle. We can evenly split the triangle in half to create two right triangles, which allows us to calculate the width of the base. Once we know the width of the base, we translate pixel widths into real-world widths. For example, if we calculate the base of the triangle to have a width of 1500mm (59 in), the player's pixel width to be 100, and the pixel width of the image to be 320, then the result is a player width of 468.75mm (18.45 in). For us to perform the calculation, we need to know the player's depth and the number of pixels wide the player spans. We take an average of the depths of the player's pixels. This normalizes the depth, because in reality no person is completely flat. If that were true, it certainly would make our calculations much easier. The calculation is the same for the player's height, but with a different angle and image dimension.
Now that we know the logic we need to perform, let us walk through the code. Create a new project that discovers and initializes a KinectSensor object. Enable both the DepthStream and SkeletonStream, and subscribe to the DepthFrameReady event on the KinectSensor object. Code the MainWindow.xaml to match Listing 3-13.
Listing 3-13. The UI for Measuring Players
<Window x:Class="BeginningKinect.Chapter3.TakingMeasure.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
Title="MainWindow" Height="800" Width="1200">
<Grid>
<StackPanel Orientation="Horizontal">
<Image x:Name="DepthImage"/>
<ItemsControl x:Name="PlayerDepthData" Width="300" TextElement.FontSize="20">
<ItemsControl.ItemTemplate>
<DataTemplate>
<StackPanel Margin="0,15">
<StackPanel Orientation="Horizontal">
<TextBlock Text="PlayerId:"/>
<TextBlock Text="{Binding Path=PlayerId}"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<TextBlock Text="Width:"/>
<TextBlock Text="{Binding Path=RealWidth}"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<TextBlock Text="Height:"/>
<TextBlock Text="{Binding Path=RealHeight}"/>
</StackPanel>
</StackPanel>
</DataTemplate>
</ItemsControl.ItemTemplate>
</ItemsControl>
</StackPanel>
</Grid>
</Window>
The purpose of the ItemsControl is to display player measurements. Our approach is to create an object to collect player depth data and perform the calculations determining the real width and height values of the user. The application maintains an array of these objects, and the array becomes the ItemsSource for the ItemsControl. The UI defines a template to display relevant data for each player depth object, which we will call PlayerDepthData. Before creating this class, let's review the code that interfaces with the class to see how it is used. Listing 3-14 shows a method named CalculatePlayerSize, which is called from the DepthFrameReady event handler.
Listing 3-14. Calculating Player Sizes
private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
using(DepthImageFrame frame = e.OpenDepthImageFrame())
{
if(frame != null)
{
frame.CopyPixelDataTo(this._DepthPixelData);
CreateBetterShadesOfGray(frame, this._DepthPixelData);
CalculatePlayerSize(frame, this._DepthPixelData);
}
}
}
private void CalculatePlayerSize(DepthImageFrame depthFrame, short[] pixelData)
{
int depth;
int playerIndex;
int pixelIndex;
int bytesPerPixel = depthFrame.BytesPerPixel;
PlayerDepthData[] players = new PlayerDepthData[6];
//First pass - Calculate stats from the pixel data
for(int row = 0; row < depthFrame.Height; row++)
{
for(int col = 0; col < depthFrame.Width; col++)
{
pixelIndex = col + (row * depthFrame.Width);
depth = pixelData[pixelIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;
if(depth != 0)
{
playerIndex = (pixelData[pixelIndex] & DepthImageFrame.PlayerIndexBitmask);
playerIndex -= 1;
if(playerIndex > -1)
{
if(players[playerIndex] == null)
{
players[playerIndex] = new PlayerDepthData(playerIndex + 1,
depthFrame.Width,depthFrame.Height);
}
players[playerIndex].UpdateData(col, row, depth);
}
}
}
}
PlayerDepthData.ItemsSource = players;
}
The key lines of code in Listing 3-14 all reference uses of the PlayerDepthData object in some way. The logic of the CalculatePlayerSize method goes pixel by pixel through the depth image and extracts the depth and player index values. The algorithm ignores any pixel with a depth value of zero or not associated with a player. For any pixel belonging to a player, the code calls the UpdateData method on the PlayerDepthData object of that player. After processing all pixels, the code sets the players array to be the source for the ItemsControl named PlayerDepthData. The real work of calculating each player's size is encapsulated within the PlayerDepthData object, which we'll turn our attention to now.

Create a new class named PlayerDepthData. The code is shown in Listing 3-15. This object is the workhorse of the project. It holds and maintains player depth data, and calculates the real-world width and height accordingly.
Listing 3-15. Object to Hold and Maintain Player Depth Data

public class PlayerDepthData
{
    #region Member Variables
    private const double MillimetersPerInch       = 0.0393700787;
    private static readonly double HorizontalTanA = Math.Tan(28.5 * Math.PI / 180);
    private static readonly double VerticalTanA   = Math.Abs(Math.Tan(21.5 * Math.PI / 180));

    private int _DepthSum;
    private int _DepthCount;
    private int _LoWidth;
    private int _HiWidth;
    private int _LoHeight;
    private int _HiHeight;
    #endregion Member Variables

    #region Constructor
    public PlayerDepthData(int playerId, double frameWidth, double frameHeight)
    {
        this.PlayerId    = playerId;
        this.FrameWidth  = frameWidth;
        this.FrameHeight = frameHeight;
        this._LoWidth    = int.MaxValue;
        this._HiWidth    = int.MinValue;
        this._LoHeight   = int.MaxValue;
        this._HiHeight   = int.MinValue;
    }
    #endregion Constructor

    #region Methods
    public void UpdateData(int x, int y, int depth)
    {
        this._DepthCount++;
        this._DepthSum += depth;
        this._LoWidth   = Math.Min(this._LoWidth, x);
        this._HiWidth   = Math.Max(this._HiWidth, x);
        this._LoHeight  = Math.Min(this._LoHeight, y);
        this._HiHeight  = Math.Max(this._HiHeight, y);
    }
    #endregion Methods

    #region Properties
    public int PlayerId { get; private set; }
    public double FrameWidth { get; private set; }
    public double FrameHeight { get; private set; }

    public double Depth
    {
        get { return this._DepthSum / (double) this._DepthCount; }
    }

    public int PixelWidth
    {
        get { return this._HiWidth - this._LoWidth; }
    }

    public int PixelHeight
    {
        get { return this._HiHeight - this._LoHeight; }
    }

    public double RealWidth
    {
        get
        {
            double opposite = this.Depth * HorizontalTanA;
            return this.PixelWidth * 2 * opposite / this.FrameWidth * MillimetersPerInch;
        }
    }

    public double RealHeight
    {
        get
        {
            double opposite = this.Depth * VerticalTanA;
            return this.PixelHeight * 2 * opposite / this.FrameHeight * MillimetersPerInch;
        }
    }
    #endregion Properties
}
The primary reason the PlayerDepthData class exists is to encapsulate the measurement calculations and make the process easier to understand. The class accomplishes this by having two input points and two outputs. The constructor and the UpdateData method are the two forms of input, and the RealWidth and RealHeight properties are the output. The code behind each of the output properties calculates the result based on the formulas detailed in Figure 3-12. Each formula relies on a normalized depth value, a measurement of the frame (width or height), and the total pixels consumed by the player. The normalized depth and total pixel measure derive from data passed to the UpdateData method. The real width and height values are only as good as the data supplied to the UpdateData method.

Figure 3-13 shows the results of this project. Each frame exhibits a user in different poses. The images show a UI different from the one in our project in order to better illustrate the player measurement calculations. The width and height calculations adjust for each altered posture. Note that the width and height values are only for the visible area. Take the first frame of Figure 3-13. The user's height is not actually 42 inches, but the height of the user seen by Kinect is 42 inches. The user's real height is 74 inches, which means that only just over half of the user is visible. The width value has a similar caveat.
Figure 3-13. Player measurements in different poses
Aligning Depth and Video Images

In our previous examples, we altered the pixels of the depth image to better indicate which pixels belong to users. We colored the player pixels and altered the color of the non-player pixels. However, there are instances where you want to alter the pixels in the video image based on the player pixels. There is an effect used by moviemakers called green screening or, more technically, chroma keying. This is where an actor stands in front of a green backdrop and acts out a scene. Later, the backdrop is edited out of the scene and replaced with some other type of background. This is common in sci-fi movies, where it is impossible to send actors to Mars, for example, to perform a scene. We can create this same type of effect with Kinect, and the Microsoft SDK makes this easy. The code to write this type of application is not much different from what we have already coded in this chapter.

Note: This type of application is a basic example of an augmented reality experience. Augmented reality applications are extremely fun and captivatingly immersive experiences. Many artists are using Kinect to create augmented reality interactive exhibits. Additionally, these types of experiences are used as tools for advertising and marketing.
We know how to get Kinect to tell us which pixels belong to users, but only for the depth image. Unfortunately, the pixels of the depth image do not translate one-to-one with those created by the color stream, even if you set the resolutions of each stream to the same value. The pixels of the two cameras are not aligned, because they are positioned on Kinect just like the eyes on your face. Your eyes see in stereo, in what is called stereo vision. Close your left eye and notice how your view of the world is different. Now close your right eye and open your left. What you see is different from what you saw when only your right eye was open. When both of your eyes are open, your brain does the work to merge the images you see from each eye into one.

The calculations required to translate pixels from one camera to the other are not trivial either. Fortunately, the SDK provides methods that do the work for us. The methods are located on the KinectSensor and are named MapDepthToColorImagePoint, MapDepthToSkeletonPoint, MapSkeletonPointToColor, and MapSkeletonPointToDepth. The DepthImageFrame object has methods with slightly different names, but which function the same (MapFromSkeletonPoint, MapToColorImagePoint, and MapToSkeletonPoint). For this project, we use the MapDepthToColorImagePoint method to translate a depth image pixel position into a pixel position on a color image. In case you are wondering, there is no method to get the depth pixel based on the coordinates of a color pixel.
Create a new project and add two Image elements to the MainWindow.xaml layout. The first image is the background and can be hard-coded to whatever image you want. The second image is the foreground and is the image we will create. Listing 3-16 shows the XAML for this project.
Listing 3-16. Green Screen App UI
<Window x:Class="Apress.BeginningKinect.Chapter3.GreenScreen.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
Title="MainWindow">
<Grid>
<Image Source="/WineCountry.JPG" />
<Image x:Name="GreenScreenImage"/>
</Grid>
</Window>
In this project, we will employ polling to ensure that the color and depth frames are as closely aligned as possible. The cutout is more accurate the closer the frames are in timestamp, and every millisecond counts. While it is possible to use the AllFramesReady event on the KinectSensor object, this does not guarantee that the frames given by the event arguments are close in time with one another. The frames will never be in complete synchronization, but the polling model gets the frames as close as possible. Listing 3-17 shows the infrastructure code to discover a device, enable the streams, and poll for frames.
Listing 3-17. Polling Infrastructure

#region Member Variables
private KinectSensor _KinectDevice;
private WriteableBitmap _GreenScreenImage;
private Int32Rect _GreenScreenImageRect;
private int _GreenScreenImageStride;
private short[] _DepthPixelData;
private byte[] _ColorPixelData;
#endregion Member Variables

private void CompositionTarget_Rendering(object sender, EventArgs e)
{
    DiscoverKinect();

    if(this._KinectDevice != null)
    {
        try
        {
            ColorImageStream colorStream = this._KinectDevice.ColorStream;
            DepthImageStream depthStream = this._KinectDevice.DepthStream;

            using(ColorImageFrame colorFrame = colorStream.OpenNextFrame(100))
            {
                using(DepthImageFrame depthFrame = depthStream.OpenNextFrame(100))
                {
                    RenderGreenScreen(this._KinectDevice, colorFrame, depthFrame);
                }
            }
        }
        catch(Exception)
        {
            //Handle exception as needed
        }
    }
}

private void DiscoverKinect()
{
    if(this._KinectDevice != null && this._KinectDevice.Status != KinectStatus.Connected)
    {
        this._KinectDevice.ColorStream.Disable();
        this._KinectDevice.DepthStream.Disable();
        this._KinectDevice.SkeletonStream.Disable();
        this._KinectDevice.Stop();
        this._KinectDevice = null;
    }

    if(this._KinectDevice == null)
    {
        this._KinectDevice = KinectSensor.KinectSensors.FirstOrDefault(x => x.Status ==
                                                                KinectStatus.Connected);

        if(this._KinectDevice != null)
        {
            this._KinectDevice.SkeletonStream.Enable();
            this._KinectDevice.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
            this._KinectDevice.ColorStream.Enable(ColorImageFormat.RgbResolution1280x960Fps12);

            DepthImageStream depthStream = this._KinectDevice.DepthStream;
            this._GreenScreenImage       = new WriteableBitmap(depthStream.FrameWidth,
                                                               depthStream.FrameHeight, 96, 96,
                                                               PixelFormats.Bgra32, null);
            this._GreenScreenImageRect   = new Int32Rect(0, 0, depthStream.FrameWidth,
                                                         depthStream.FrameHeight);
            this._GreenScreenImageStride = depthStream.FrameWidth * 4;
            this.GreenScreenImage.Source = this._GreenScreenImage;

            this._DepthPixelData = new short[depthStream.FramePixelDataLength];
            this._ColorPixelData = new byte[this._KinectDevice.ColorStream.FramePixelDataLength];

            this._KinectDevice.Start();
        }
    }
}
The basic implementation of the polling model in Listing 3-17 should be common and straightforward by now. There are a few lines of code of note. The first is the method call to RenderGreenScreen. Comment out this line of code for now; we implement it next. The next lines of note, which enable the color and depth streams, factor into the quality of our background subtraction process. When mapping between the color and depth images, it is best that the color image resolution be twice that of the depth stream, to ensure the best possible pixel translation.

The RenderGreenScreen method does the actual work of this project. It creates a new color image by removing the non-player pixels from the color image. The algorithm starts by iterating over each pixel of the depth image, and determines if the pixel has a valid player index value. The next step is to get the corresponding color pixel for any pixel belonging to a player, and add that pixel to a new byte array of pixel data. All other pixels are discarded. The code for this method is shown in Listing 3-18.
Listing3-18.PerformingBackgroundSubstraction
private void RenderGreenScreen(KinectSensor kinectDevice, ColorImageFrame colorFrame,
DepthImageFrame depthFrame)
{
if(kinectDevice != null && depthFrame != null && colorFrame != null)
{
int depthPixelIndex;
int playerIndex;
int colorPixelIndex;
ColorImagePoint colorPoint;
int colorStride = colorFrame.BytesPerPixel * colorFrame.Width;
int bytesPerPixel = 4;
byte[] playerImage = new byte[depthFrame.Height * this._GreenScreenImageStride];
int playerImageIndex = 0;
depthFrame.CopyPixelDataTo(this._DepthPixelData);
colorFrame.CopyPixelDataTo(this._ColorPixelData);
for(int depthY = 0; depthY < depthFrame.Height; depthY++)
{
for(int depthX = 0; depthX < depthFrame.Width; depthX++,
playerImageIndex += bytesPerPixel)
{
depthPixelIndex = depthX + (depthY * depthFrame.Width);
playerIndex
= this._DepthPixelData[depthPixelIndex] &
DepthImageFrame.PlayerIndexBitmask;
if(playerIndex != 0)
{
colorPoint = kinectDevice.MapDepthToColorImagePoint(depthX, depthY,
this._DepthPixelData[depthPixelIndex],
colorFrame.Format, depthFrame.Format);
colorPixelIndex = (colorPoint.X * colorFrame.BytesPerPixel) +
(colorPoint.Y * colorStride);

playerImage[playerImageIndex] = this._ColorPixelData[colorPixelIndex]; //Blue
playerImage[playerImageIndex + 1] = this._ColorPixelData[colorPixelIndex + 1]; //Green
playerImage[playerImageIndex + 2] = this._ColorPixelData[colorPixelIndex + 2]; //Red
playerImage[playerImageIndex + 3] = 0xFF; //Alpha
}
}
}

this._GreenScreenImage.WritePixels(this._GreenScreenImageRect, playerImage,
this._GreenScreenImageStride, 0);
}
}
The byte array playerImage holds the color pixels belonging to players. Since the depth image is the source of our player data input, it becomes the lowest common denominator. The image created from these pixels is the same size as the depth image. Unlike the depth image, which uses two bytes per pixel, the player image uses four bytes per pixel: blue, green, red, and alpha. The alpha byte is important to this project as it determines the transparency of each pixel. The player pixels get an alpha value of 255 (0xFF), meaning they are fully opaque, whereas the non-player pixels get a value of zero and are transparent.
The MapDepthToColorImagePoint method takes in the depth pixel coordinates and the depth, and returns the color coordinates. The format of the depth value requires mentioning. The mapping method requires the raw depth value, including the player index bits; otherwise the returned result is incorrect.
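To make the bit layout concrete, the following is a minimal sketch of unpacking a raw depth pixel by hand. The constants mirror the SDK's DepthImageFrame.PlayerIndexBitmask and PlayerIndexBitmaskWidth values (a 3-bit player index packed into the low bits of each 16-bit pixel); the sample pixel value is an assumption chosen for illustration.

```csharp
//The low 3 bits of each raw depth pixel hold the player index; the
//remaining bits hold the depth in millimeters.
const int PlayerIndexBitmask = 0x0007;  //Mirrors DepthImageFrame.PlayerIndexBitmask
const int PlayerIndexBitmaskWidth = 3;  //Mirrors DepthImageFrame.PlayerIndexBitmaskWidth

short rawPixel = 0x3E82;                                      //Hypothetical sample value
int playerIndex = rawPixel & PlayerIndexBitmask;              //2 for this sample
int depthInMillimeters = rawPixel >> PlayerIndexBitmaskWidth; //2000 for this sample
```

This is why the mapping method needs the raw value: stripping the player index bits first would shift the depth out of the position the method expects.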
The remaining code of Listing 3-18 extracts the color pixel values and stores them in the playerImage array. After processing all depth pixels, the code updates the pixels of the player bitmap.
Run this program, and it is quickly apparent the effect is not perfect. It works well when the user stands still. However, if the user moves quickly, the process breaks down, because the depth and color frames cannot stay aligned. Notice in Figure 3-14, the pixels on the user’s left side are not crisp and show noise. It is possible to fix this, but the process is non-trivial. It requires smoothing of the pixels around the player. For the best results, it is necessary to merge several frames of images into one. We pick this project back up in Chapter 8 to demonstrate how tools like OpenCV can do this work for us.
Figure 3-14. A visit to wine country
Depth Near Mode

The original purpose for Kinect was to serve as a game control for the Xbox. The Xbox is primarily played in a living room space where the user is a few feet away from the TV screen and Kinect. After the initial release, developers all over the world began building applications using Kinect on PCs. Several of these PC-based applications require Kinect to see or focus at a much closer range than is available with the original hardware. The developer community called on Microsoft to update Kinect so that Kinect could return depth data for distances nearer than 800mm (31.5 inches).
Microsoft answered by releasing new hardware specially configured for use on PCs. The new hardware goes by the name Kinect for Windows and the original hardware by Kinect for Xbox. The Kinect for Windows SDK has a number of API elements specific to the new hardware. The Range property sets the view range of the Kinect sensor. The Range property is of type DepthRange—an enumeration with two options, as shown in Table 3-1. All depth ranges are inclusive.
Table 3-1. DepthRange Values

DepthRange    What it is

Normal        Sets the viewable depth range to 800mm (2’7.5”) – 4000mm (13’1.48”). Has an
              integer value of 0.

Near          Sets the viewable depth range to 400mm (1’3.75”) – 3000mm (9’10.11”). Has an
              integer value of 1.
The Range property can be changed dynamically while the DepthImageStream is enabled and producing frames. This allows for dynamic and quick changes in focus as needed, without having to restart the KinectSensor or the DepthImageStream. However, the Range property is sensitive to the type of Kinect hardware being used. Changing the Range property to DepthRange.Near when using Kinect for Xbox hardware results in an InvalidOperationException exception with a message of, “The feature is not supported by this version of the hardware.” Near mode viewing is only supported by Kinect for Windows hardware.
Two additional properties accompany the near depth range feature. They are MinDepth and MaxDepth. These properties describe the boundaries of Kinect’s depth range. Both values update on any change to the Range property value.
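Because the hardware check happens at runtime, a defensive sketch like the following can attempt near mode and fall back gracefully. It assumes a started sensor in a variable named kinectDevice; it is an illustration, not this chapter’s project code.

```csharp
//Attempt to switch to near mode; Kinect for Xbox hardware rejects this
//with an InvalidOperationException, so fall back to the normal range.
try
{
    kinectDevice.DepthStream.Range = DepthRange.Near;
}
catch(InvalidOperationException)
{
    //Near mode is only supported by Kinect for Windows hardware.
    kinectDevice.DepthStream.Range = DepthRange.Normal;
}

//MinDepth and MaxDepth update to reflect the active range.
int minDepth = kinectDevice.DepthStream.MinDepth;
int maxDepth = kinectDevice.DepthStream.MaxDepth;
```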
One final feature of note with the depth stream is the special treatment of depth values that exceed the boundaries of the depth range. The DepthImageStream defines two properties named TooFarDepth and TooNearDepth, which give the application more information about the out-of-range depth. There are instances when a depth is completely indeterminate and is given a value equal to the UnknownDepth property on the DepthImageStream.
Summary

Depth is fundamental to Kinect. Depth is what differentiates it from all other input devices. Understanding how to work with Kinect’s depth data is equally fundamental to developing Kinect experiences. Any Kinect application that does not incorporate depth is underutilizing the hardware, and ultimately limiting the user experience from reaching its fullest potential. While not every application needs to access and process the raw depth data, as a developer or application architect you need to know this data is available and how to exploit it to the benefit of the user experience. Further, while your application may not process the data directly, it will receive a derivative of the data. Kinect processes the original depth data to determine which pixels belong to each user. The skeleton tracking engine component of the SDK performs more extensive processing of depth data to produce user skeleton information.
It is less frequent for a real-world Kinect experience to use the raw depth data directly. It is more common to use third-party tools such as OpenCV to process this data, as we will show in Chapter 8. Processing of raw depth data is not always a trivial process. It also can have extreme performance demands. This alone means that a managed language like C# is not always the best tool for the job. This is not to say it is impossible, but it often requires a lower level of processing than C# can provide. If the kind of depth image processing you want to do is unachievable with an existing third-party library, create your own C/C++ library to do the processing. Your WPF application can then use it.
Depth data comes in two forms. The SDK does some processing to determine which pixels belong to a player. This is powerful information to have and provides a basis for at least rudimentary image processing to build interactive experiences. By creating simple statistics around a player’s depth values,
we can tell when a player is in the view area of Kinect or, more specifically, where they are in relation to the entire viewing area. Using the data, your application could perform some action like play a sound clip of applause when Kinect detects a new user, or a series of “boo” sounds when a user leaves the view area. However, before you run off and start writing code that does this, wait until the next chapter when we introduce skeleton tracking. The Kinect for Windows SDK’s skeleton tracking engine makes this an easier task. The point is that the data is available for you to use if your application needs it.
Calculating the dimensions of objects in real-world space is one reason to process depth data. In order to do this you must understand the physics behind camera optics, and be proficient in trigonometry. The view angles of Kinect create triangles. Since we know the angles of these triangles and the depth, we can measure anything in the view field. As an aside, imagine how proud your high school trig teacher would be to know you are using the skills she taught you.
The last project of this chapter converted depth pixel coordinates into color stream coordinates to perform background subtraction on an image. The example code demonstrated a very simple and practical use case. In gaming, it is more common to use an avatar to represent the user. However, many other Kinect experiences incorporate the video camera and depth data, with augmented reality concepts being the most common.
Finally, this chapter covered the near depth mode available on Kinect for Windows hardware. Using a simple set of properties, an application can dynamically change the depth range viewable by Kinect. This concludes coverage of the depth stream. Now let’s move on to review the skeleton stream.
CHAPTER 4
Skeleton Tracking
The raw depth data produced by Kinect has limited uses. To build truly interactive, fun, and memorable experiences with Kinect, we need more information beyond just the depth of each pixel. This is where skeleton tracking comes in. Skeleton tracking is the processing of depth image data to establish the positions of various skeleton joints on a human form. For example, skeleton tracking determines where a user’s head, hands, and center of mass are. Skeleton tracking provides X, Y, and Z values for each of these skeleton points. In the previous chapter, we explored elementary depth image processing techniques. Skeleton tracking systems go beyond our introductory image processing routines. They analyze depth images employing complicated algorithms that use matrix transforms, machine learning, and other means to calculate skeleton points.
In the first section of this chapter, we build an application that works with all of the major objects of the skeleton tracking system. What follows is a thorough examination of the skeleton tracking object model. It is important to know what data the skeleton tracking engine provides you. We next proceed to building a complete game using the Kinect and skeleton tracking. We use everything learned in this chapter to build the game, which will serve as a springboard for other Kinect experiences. The chapter concludes with an examination of a hardware feature that can improve the quality of the skeleton tracking.
The analogy that you must walk before you run applies to this book. Up to and including this chapter we have been learning to walk. After this chapter we run. The fundamentals of skeleton tracking learned here create a foundation for the next two chapters and every Kinect application you write going forward. You will find that in virtually every application you create using Kinect, the vast majority of your code will focus on the skeleton tracking objects. After completing this chapter, we will have covered all components of the Kinect for Windows SDK dealing with Kinect’s cameras. We start with an application that draws stick figures from skeleton data produced by the SDK’s skeleton stream.
Seeking Skeletons

Our goal is to be able to write an application that draws the skeleton of every user in Kinect’s view area. Before jumping into code and working with skeleton data, we should first walk through the basic options and see how to get skeleton data. It is also helpful to know the format of the data so that we can perform any necessary data manipulation. However, the intent is for this examination to be brief, so we understand just enough of the skeleton objects and data to draw skeletons.
Skeleton data comes from the SkeletonStream. Data from this stream is accessible either from events or by polling, similarly to the color and depth streams. In this walkthrough, we use events simply because it takes less code and is a more common and basic approach. The KinectSensor object has an event named SkeletonFrameReady, which fires each time new skeleton data becomes available. Skeleton data is also available from the AllFramesReady event. We look at the skeleton tracking object model in greater detail shortly, but for now, we only concern ourselves with getting skeleton data from the
stream. Each frame of the SkeletonStream produces a collection of Skeleton objects. Each Skeleton object contains data that describes the location of the skeleton and the skeleton’s joints. Each joint has an identity (head, shoulder, elbow, etc.) and a 3D vector.
Now let’s write some code. Create a new project with a reference to the Microsoft.Kinect dll, and add the basic boilerplate code for capturing a connected Kinect sensor. Before starting the sensor, enable the SkeletonStream and subscribe to the SkeletonFrameReady event. Our first project to introduce skeleton tracking does not use the video or depth streams. The initialization should appear as in Listing 4-1.
Listing 4-1. Simple Skeleton Tracking Initialization
#region Member Variables
private KinectSensor _KinectDevice;
private readonly Brush[] _SkeletonBrushes;
private Skeleton[] _FrameSkeletons;
#endregion Member Variables
#region Constructor
public MainWindow()
{
InitializeComponent();
this._SkeletonBrushes = new [] { Brushes.Black, Brushes.Crimson, Brushes.Indigo,
Brushes.DodgerBlue, Brushes.Purple, Brushes.Pink };
KinectSensor.KinectSensors.StatusChanged += KinectSensors_StatusChanged;
this.KinectDevice = KinectSensor.KinectSensors
.FirstOrDefault(x => x.Status == KinectStatus.Connected);
}
#endregion Constructor
#region Methods
private void KinectSensors_StatusChanged(object sender, StatusChangedEventArgs e)
{
switch (e.Status)
{
case KinectStatus.Initializing:
case KinectStatus.Connected:
this.KinectDevice = e.Sensor;
break;
case KinectStatus.Disconnected:
//TODO: Give the user feedback to plug-in a Kinect device.
this.KinectDevice = null;
break;
default:
//TODO: Show an error state
break;
}
}
#endregion Methods
#region Properties
public KinectSensor KinectDevice
{
get { return this._KinectDevice; }
set
{
if(this._KinectDevice != value)
{
//Uninitialize
if(this._KinectDevice != null)
{
this._KinectDevice.Stop();
this._KinectDevice.SkeletonFrameReady -= KinectDevice_SkeletonFrameReady;
this._KinectDevice.SkeletonStream.Disable();
this._FrameSkeletons = null;
}
this._KinectDevice = value;
//Initialize
if(this._KinectDevice != null)
{
if(this._KinectDevice.Status == KinectStatus.Connected)
{
this._KinectDevice.SkeletonStream.Enable();
this._FrameSkeletons = new
Skeleton[this._KinectDevice.SkeletonStream.FrameSkeletonArrayLength];
this.KinectDevice.SkeletonFrameReady +=
KinectDevice_SkeletonFrameReady;
this._KinectDevice.Start();
}
}
}
}
}
#endregion Properties
Take note of the _FrameSkeletons array and how the array memory is allocated during stream initialization. The number of skeletons tracked by Kinect is constant. This allows us to create the array once and use it throughout the life of the application. Conveniently, the SDK defines a constant for the array size on the SkeletonStream. The code in Listing 4-1 also defines an array of brushes. These brushes will be used to color the lines connecting skeleton joints. You are welcome to customize the brush colors to be your favorite colors instead of the ones in the code listing.
The code in Listing 4-2 shows the event handler for the SkeletonFrameReady event. Each time the event handler executes it retrieves the current frame by calling the OpenSkeletonFrame method on the event argument parameter. The remaining code iterates over the frame’s array of Skeleton objects and draws lines on the UI that connect the skeleton joints. This creates a stick figure for each skeleton. The UI for our application is simple. It is only a Grid element named “LayoutRoot” with the background set to white.
Listing 4-2. Producing Stick Figures
private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
using(SkeletonFrame frame = e.OpenSkeletonFrame())
{
if(frame != null)
{
Brush userBrush;
Skeleton skeleton;
JointType[] joints;
LayoutRoot.Children.Clear();
frame.CopySkeletonDataTo(this._FrameSkeletons);
for(int i = 0; i < this._FrameSkeletons.Length; i++)
{
skeleton = this._FrameSkeletons[i];
if(skeleton.TrackingState == SkeletonTrackingState.Tracked)
{
userBrush = this._SkeletonBrushes[i % this._SkeletonBrushes.Length];
//Draws the skeleton’s head and torso
joints = new [] { JointType.Head, JointType.ShoulderCenter,
JointType.ShoulderLeft, JointType.Spine,
JointType.ShoulderRight, JointType.ShoulderCenter,
JointType.HipCenter, JointType.HipLeft,
JointType.Spine, JointType.HipRight,
JointType.HipCenter };
LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));
//Draws the skeleton’s left leg
joints = new [] { JointType.HipLeft, JointType.KneeLeft,
JointType.AnkleLeft, JointType.FootLeft };
LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));
//Draws the skeleton’s right leg
joints = new [] { JointType.HipRight, JointType.KneeRight,
JointType.AnkleRight, JointType.FootRight };
LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));
//Draws the skeleton’s left arm
joints = new [] { JointType.ShoulderLeft, JointType.ElbowLeft,
JointType.WristLeft, JointType.HandLeft };
LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));
//Draws the skeleton’s right arm
joints = new [] { JointType.ShoulderRight, JointType.ElbowRight,
JointType.WristRight, JointType.HandRight };
LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));
}
}
}
}
}
Each time we process a skeleton, our first step is to determine if we have an actual skeleton. One way to do this is with the TrackingState property of the skeleton. Only those users actively tracked by the skeleton tracking engine are drawn. We ignore processing any skeletons that are not tracking a user (that is, the TrackingState is not equal to SkeletonTrackingState.Tracked). While Kinect can detect up to six users, it only tracks joint positions for two. We explore this and the TrackingState property in greater depth later in the chapter.
The processing performed on the skeleton data is simple. We select a brush to color the stick figure based on the position of the player in the collection. Next, we draw the stick figure. You can find the methods that create the actual UI elements in Listing 4-3. The CreateFigure method draws the skeleton stick figure for a single skeleton object. The GetJointPoint method is critical to drawing the stick figure. This method takes the position vector of the joint and calls the MapSkeletonPointToDepth method on the KinectSensor instance to convert the skeleton coordinates to the depth image coordinates. Later in the chapter, we discuss why this conversion is necessary and define the coordinate systems involved. At this point, the simple explanation is that the skeleton coordinates are not the same as depth or video image coordinates, or even the UI coordinates. The conversion of coordinate systems or scaling from one coordinate system to another is quite common when building Kinect applications. The central takeaway is the GetJointPoint method converts the skeleton joint from the skeleton coordinate system into the UI coordinate system and returns a point in the UI where the joint should be.
Listing 4-3. Drawing Skeleton Joints
private Polyline CreateFigure(Skeleton skeleton, Brush brush, JointType[] joints)
{
Polyline figure = new Polyline();
figure.StrokeThickness = 8;
figure.Stroke = brush;

for(int i = 0; i < joints.Length; i++)
{
figure.Points.Add(GetJointPoint(skeleton.Joints[joints[i]]));
}

return figure;
}
private Point GetJointPoint(Joint joint)
{
DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(joint.Position,
this.KinectDevice.DepthStream.Format);
point.X *= (int) this.LayoutRoot.ActualWidth / this.KinectDevice.DepthStream.FrameWidth;
point.Y *= (int) this.LayoutRoot.ActualHeight / this.KinectDevice.DepthStream.FrameHeight;

return new Point(point.X, point.Y);
}
It is also important to point out we are discarding the Z value. It seems a waste for Kinect to do a bunch of work to produce a depth value for every joint and then for us to not use this data. In actuality, we are using the Z value, but not explicitly. It is just not used in the user interface. The coordinate space conversion requires the depth value. Test this yourself by calling the MapSkeletonPointToDepth method, passing in the X and Y values of the joint, and setting the Z value to zero. The outcome is that the depth X and depth Y variables always return as 0. As an additional exercise, use the depth value to apply a ScaleTransform to the skeleton figures based on the Z value. The scale values are inversely proportional to the depth value. This means that the smaller the depth value, the larger the scale value, so that the closer a user is to the Kinect, the larger the skeleton.
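As a starting point for that exercise, here is a minimal sketch of the inverse relationship. The reference depth constant and the sample Z value are assumptions chosen for illustration, not values from the SDK.

```csharp
//Scale inversely with depth: a joint at half the reference depth draws
//at twice the size. Skeleton Z values are measured in meters.
const double ReferenceDepthInMeters = 2.0; //Hypothetical "normal" distance
double jointDepthInMeters = 1.0;           //Sample Z value of a skeleton joint

double scale = ReferenceDepthInMeters / jointDepthInMeters;
//scale is 2.0 here; apply it to the figure with a ScaleTransform.
```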
Compile and run the project. The output should be similar to that shown in Figure 4-1. The application displays a colored stick figure for each skeleton. Grab a friend and watch it track your movements. Study how the skeleton tracking reacts to your movements. Try different poses and gestures. Move to where only half of your body is in Kinect’s view area and note how the skeleton drawings become jumbled. Walk in and out of the view area and notice that the skeleton colors change. The sometimes-strange behavior of the skeleton drawings becomes clear in the next section where we explore the skeleton API in greater depth.
Figure 4-1. Stick figures generated from skeleton data
The Skeleton Object Model

There are more objects, structures, and enumerations associated with skeleton tracking than any other feature of the SDK. In fact, skeleton tracking accounts for over a third of the entire SDK. Skeleton tracking is obviously a significant component. Figure 4-2 illustrates the primary elements of skeleton tracking. There are four major elements (SkeletonStream, SkeletonFrame, Skeleton and Joint) and several supporting parts. The subsections to follow describe in detail the major objects and structures.
Figure 4-2. Skeleton object model
SkeletonStream
The SkeletonStream generates SkeletonFrame objects. Retrieving frame data from the SkeletonStream is similar to the ColorStream and DepthStream. Applications retrieve skeleton data either from the SkeletonFrameReady event, the AllFramesReady event, or from the OpenNextFrame method. Note that calling the OpenNextFrame method after subscribing to the SkeletonFrameReady event of the KinectSensor object results in an InvalidOperationException exception.
Enabling and Disabling
The SkeletonStream does not produce any data until enabled. By default, it is disabled. To activate the SkeletonStream so that it begins generating data, call its Enable method. There is also a method named Disable, which suspends the production of skeleton data. The SkeletonStream object also has a property named IsEnabled, which describes the current state of skeleton data production. The SkeletonFrameReady event on a KinectSensor object does not fire until the SkeletonStream is enabled. If choosing to employ a polling architecture, the SkeletonStream must be enabled before calling the OpenNextFrame method; otherwise, the call throws an InvalidOperationException exception.
In most applications, once enabled the SkeletonStream is unlikely to be disabled during the lifetime of the application. However, there are instances where it is desirable to disable the stream. One example is when using multiple Kinects in an application—an advanced topic that is not covered in this book. Note that only one Kinect can report skeleton data for each process. This means that even with multiple Kinects running, the application is still limited to two skeletons. The application must then choose on which Kinect to enable skeleton tracking. During application execution, it is then possible to change which Kinect is actively tracking skeletons by disabling the SkeletonStream of one Kinect and enabling that of the other.
Another reason to disable skeleton data production is for performance. Skeleton processing is an expensive operation. This is obvious by watching the CPU usage of an application with skeleton tracking enabled. To see this for yourself, open Windows Task Manager and run the stick figure application from the first section of this chapter. The stick figure application does very little, yet it has a relatively high CPU usage. This is a result of skeleton tracking. Disabling skeleton tracking is useful when your application does not need skeleton data, and in some instances, disabling skeleton tracking may be necessary. For example, in a game some event might trigger a complex animation or cutscene video. For the duration of the animation or video sequence skeleton data is not needed. Disabling skeleton tracking might also be necessary to ensure a smooth animation sequence or video playback.
There is a side effect to disabling the SkeletonStream. All stream data production stops and restarts when the SkeletonStream changes state. This is not the case when the color or depth streams are disabled. A change in the SkeletonStream state causes the sensor to reinitialize. This process resets the TimeStamp and FrameNumber of all frames to zero. There is also a slight lag when the sensor reinitializes, but it is only a few milliseconds.
Smoothing
As you gain experience working with skeletal data, you notice skeleton movement is often jumpy. There are several possible causes of this, ranging from poor application performance to a user’s behavior (many factors can cause a person to shake or not move smoothly), to the performance of the Kinect hardware. The variance of a joint’s position can be relatively large from frame to frame, which can negatively affect an application in different ways. In addition to creating an awkward user experience and being aesthetically displeasing, it is confusing to users when their avatar or hand cursor appears shaky or, worse, convulsive.
The SkeletonStream has a way to solve this problem by normalizing position values, reducing the variance in joint positions from frame to frame. Enable smoothing by using the overloaded Enable method and passing in a TransformSmoothParameters structure. For reference, the SkeletonStream has two read-only properties for smoothing named IsSmoothingEnabled and SmoothParameters. The IsSmoothingEnabled property is set to true when the stream is enabled with a TransformSmoothParameters and false when the default Enable method is used. The SmoothParameters property stores the defined smoothing parameters. The TransformSmoothParameters structure defines these properties:
• Correction – Takes a float ranging from 0 to 1.0. The lower the number, the more
correction is applied.

• JitterRadius – Sets the radius of correction. If a joint position “jitters” outside of
the set radius, it is corrected to be at the radius. The property is a float value
measured in meters.

• MaxDeviationRadius – Use this setting in conjunction with the JitterRadius
setting to determine the outer bounds of the jitter radius. Any point that falls
outside of this radius is not considered a jitter, but a valid new position. The
property is a float value measured in meters.

• Prediction – Returns the number of frames predicted.

• Smoothing – Determines the amount of smoothing applied while processing
skeletal frames. It is a float type with a range of 0 to 1.0. The higher the value, the
more smoothing applied. A zero value does not alter the skeleton data.
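Putting the properties together, a sketch of enabling smoothing might look like the following. The numeric values are assumptions meant as starting points to tune during user testing, not recommended defaults.

```csharp
//Enable the skeleton stream with smoothing. Values are illustrative
//starting points; tune them per application.
var smoothingParams = new TransformSmoothParameters
{
    Smoothing          = 0.5f,  //Moderate smoothing
    Correction         = 0.1f,  //Lower numbers apply more correction
    Prediction         = 0.5f,  //Frames predicted ahead
    JitterRadius       = 0.05f, //Meters
    MaxDeviationRadius = 0.04f  //Meters
};

kinectDevice.SkeletonStream.Enable(smoothingParams);
```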
Smoothing skeleton jitters comes at a cost. The more smoothing applied, the more adversely it affects the application’s performance. Setting the smoothing parameters is more of an art than a science. There is not a single or best set of smoothing values. You have to test and tweak your application during development and user testing to see what values work best. It is likely that your application uses multiple smoothing settings at different points in the application’s execution.

■ Note The SDK uses the Holt Double Exponential Smoothing procedure to reduce the jitters from skeletal joint data. Exponential smoothing applies to data generated in relation to time, which is called time series data. Skeleton data is time series data, because the skeleton engine generates a frame of skeleton data for some interval of time.1 This smoothing process uses statistical analysis to create a moving average, which reduces the noise or extremes from the data set. This type of data processing was originally applied to financial market and economic data forecasting.2
1 Wikipedia, “Exponential smoothing,” http://en.wikipedia.org/wiki/Exponential_smoothing, 2011.
2 Paul Goodwin, “The Holt-Winters Approach to Exponential Smoothing: 50 Years Old and Going
Strong,” http://forecasters.org/pdfs/foresight/free/Issue19_goodwin.pdf, Spring 2010.
Choosing Skeletons
By default, the skeleton engine selects which available skeletons to actively track. The skeleton engine chooses the first two skeletons available for tracking, which is not always desirable, largely because the selection process is unpredictable. If you so choose, you have the option to select which skeletons to track using the AppChoosesSkeletons property and ChooseSkeletons method. The AppChoosesSkeletons property is false by default, and so the skeleton engine selects skeletons for tracking. To manually select which skeletons to track, set the AppChoosesSkeletons property to true and call the ChooseSkeletons method, passing in the TrackingIDs of the skeletons you want to track. The ChooseSkeletons method accepts one, two, or no TrackingIDs. The skeleton engine stops tracking all skeletons when the ChooseSkeletons method is passed no parameters. There are some nuances to selecting skeletons:
• A call to ChooseSkeletons when AppChoosesSkeletons is false results in an
InvalidOperationException exception.

• If AppChoosesSkeletons is set to true before the SkeletonStream is enabled, no
skeletons are actively tracked until manually selected by calling ChooseSkeletons.

• Skeletons automatically selected for tracking before AppChoosesSkeletons is set to
true continue to be actively tracked until the skeleton leaves the scene or is
manually replaced. If the automatically selected skeleton leaves the scene, it is
not automatically replaced.

• Any skeletons manually chosen for tracking continue to be tracked after
AppChoosesSkeletons is set to false until the skeleton leaves the scene. It is at this
point that the skeleton engine selects another skeleton, if any are available.
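As an illustration of manual selection, the following sketch tracks only the skeleton nearest the sensor. It assumes variables named kinectDevice and frameSkeletons like those in Listing 4-1, plus a using directive for System.Linq; it is an illustration, not this chapter’s project code.

```csharp
//Manually track only the skeleton closest to the sensor.
kinectDevice.SkeletonStream.AppChoosesSkeletons = true;

Skeleton nearest = frameSkeletons
                   .Where(s => s.TrackingState != SkeletonTrackingState.NotTracked)
                   .OrderBy(s => s.Position.Z)  //Smallest Z is closest to Kinect
                   .FirstOrDefault();

if(nearest != null)
{
    kinectDevice.SkeletonStream.ChooseSkeletons(nearest.TrackingId);
}
else
{
    kinectDevice.SkeletonStream.ChooseSkeletons(); //Stop tracking all skeletons
}
```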
SkeletonFrame
The SkeletonStream produces SkeletonFrame objects. When using the event model, the application retrieves a SkeletonFrame object from the event arguments by calling the OpenSkeletonFrame method, or from the OpenNextFrame method on the SkeletonStream when polling. The SkeletonFrame object holds skeleton data for a moment in time. The frame’s skeleton data is available by calling the CopySkeletonDataTo method. This method populates an array passed to it with skeleton data. The SkeletonFrame has a property named SkeletonArrayLength, which gives the number of skeletons it has data for. The array always returns fully populated, even when there are no users in Kinect’s view area.
Marking Time
The FrameNumber and Timestamp fields mark the moment in time in which the frame was recorded. FrameNumber is an integer that is the frame number of the depth image used to generate the skeleton frame. The frame numbers are not always sequential, but each frame number will always be greater than that of the previous frame. It is possible for the skeleton engine to skip depth frames during execution. The reasons for this vary based on overall application performance and frame rate. For example, long running processes within any of the stream event handlers can slow processing. If an application uses polling instead of the event model, it is up to the application to determine how frequently the skeleton engine generates data, and effectively from which depth frame the skeleton data derives.
The Timestamp field is the number of milliseconds since the KinectSensor was initialized. You do not need to worry about long running applications reaching the maximum FrameNumber or Timestamp. The FrameNumber is a 32-bit integer whereas the Timestamp is a 64-bit integer. Your application would have to
run continuously at 30 frames per second for just over two and a quarter years before reaching the FrameNumber maximum, and this would be way before the Timestamp was close to its ceiling. Additionally, the FrameNumber and Timestamp start over at zero each time the KinectSensor is initialized. You can rely on the FrameNumber and Timestamp values to be unique.
At this stage in the lifecycle of the SDK and overall Kinect development, these fields are important as they are used to process or analyze frames, for instance when smoothing joint values. Gesture processing is another example, and the most common, of using this data to sequence frame data. The current version of the SDK does not include a gesture engine. Until a future version of the SDK includes gesture tracking, developers have to code their own gesture recognition algorithms, which may depend on knowing the sequence of skeleton frames.
Frame Descriptors

The FloorClipPlane field is a 4-tuple (Tuple<float, float, float, float>) with each element a coefficient of the floor plane. The general equation for the floor plane is Ax + By + Cz + D = 0, which means the first tuple element corresponds to A, the second to B, and so on. The D variable in the floor plane equation is always the negative of the height of Kinect in meters from the floor. When possible, the SDK uses image processing techniques to determine the exact coefficient values; however, this is not always possible, and the values have to be estimated. The FloorClipPlane is a zero plane (all elements have a value of zero) when the floor is undeterminable.
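To make the plane equation concrete, here is a minimal sketch of recovering the sensor’s height from the coefficients. The sample tuple values are assumptions chosen for illustration (a level sensor 0.8 meters above the floor), not output from the SDK.

```csharp
//FloorClipPlane coefficients (A, B, C, D) for Ax + By + Cz + D = 0.
var floorClipPlane = Tuple.Create(0.0f, 1.0f, 0.0f, -0.8f);

//A zero plane means the floor could not be determined.
bool floorDetermined = floorClipPlane.Item1 != 0f || floorClipPlane.Item2 != 0f ||
                       floorClipPlane.Item3 != 0f || floorClipPlane.Item4 != 0f;

//D is the negative of Kinect's height above the floor, in meters.
float sensorHeightInMeters = -floorClipPlane.Item4; //0.8 for this sample
```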
Skeleton
The Skeleton class defines a set of fields to identify the skeleton, describe the position of the skeleton, and possibly the positions of the skeleton’s joints. Skeleton objects are available by passing an array to the CopySkeletonDataTo method on a SkeletonFrame. The CopySkeletonDataTo method has an unexpected behavior, which may affect memory usage and references to Skeleton objects. The Skeleton objects returned are unique to the array and not to the application. Take the following code snippet:
Skeleton[] skeletonsA = new Skeleton[frame.SkeletonArrayLength];
Skeleton[] skeletonsB = new Skeleton[frame.SkeletonArrayLength];
frame.CopySkeletonDataTo(skeletonsA);
frame.CopySkeletonDataTo(skeletonsB);
bool resultA = skeletonsA[0] == skeletonsB[0];                       //This is false
bool resultB = skeletonsA[0].TrackingId == skeletonsB[0].TrackingId; //This is true
The Skeleton objects in the arrays are not the same. The data is the same, but there are two unique instances of the objects. The CopySkeletonDataTo method creates a new Skeleton object for each null slot in the array. However, if the array slot is not null, it updates the data on the existing Skeleton object.
TrackingID
The skeleton tracking engine assigns each skeleton a unique identifier. This identifier is an integer, which grows with each new skeleton. Do not expect the values to be assigned sequentially; the only guarantee is that the next value is greater than the previous one, and the next assigned value is not otherwise predictable. If the skeleton engine loses the ability to track a user, for example when the user walks out of view, the tracking identifier for that skeleton is retired. When Kinect detects a new skeleton, it assigns a new tracking identifier. A tracking identifier of zero means that the Skeleton object is not representing a user, but is just a placeholder in the collection. Think of it as a null skeleton. Applications use the TrackingID to specify which skeletons the skeleton engine should actively track. Call the ChooseSkeletons method on the SkeletonStream object to initiate the tracking of a specific skeleton.
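Explicitly selecting a skeleton to track can be sketched as follows, assuming sensor is an initialized KinectSensor and skeleton is a Skeleton previously copied from a frame:

```csharp
// Put the SkeletonStream into application-controlled tracking mode,
// then tell the engine which skeleton to follow by its TrackingId.
// Calling ChooseSkeletons with no arguments releases the selection.
sensor.SkeletonStream.AppChoosesSkeletons = true;
sensor.SkeletonStream.ChooseSkeletons(skeleton.TrackingId);
```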
TrackingState
This field provides insight into what skeleton data is available, if any at all. Table 4-2 lists all values of the SkeletonTrackingState enumeration.

Table 4-2. SkeletonTrackingState Values

SkeletonTrackingState    What it means

NotTracked      The Skeleton object does not represent a tracked user. The Position field of
                the Skeleton and every Joint in the joints collection is a zero point (a
                SkeletonPoint where the X, Y, and Z values all equal zero).

PositionOnly    The skeleton is detected, but is not actively being tracked. The Position field
                has a non-zero point, but the position of each Joint in the joints collection is
                a zero point.

Tracked         The skeleton is actively being tracked. The Position field and all Joint
                objects in the joints collection have non-zero points.
Position
The Position field is of type SkeletonPoint and is the center of mass of the skeleton. The center of mass is roughly the same position as the spine joint. This field provides a fast and simple means of determining a user's position whether or not the user is actively being tracked. In some applications, this value is sufficient and the positions of specific joints are unnecessary. This value can also serve as criteria for manually selecting skeletons to track (SkeletonStream.ChooseSkeletons). For example, an application may want to actively track the two skeletons closest to Kinect.
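That selection might be sketched like this, assuming skeletons holds the frame's Skeleton array, sensor is an initialized KinectSensor, and System.Linq is in scope:

```csharp
// Pick the two detected skeletons with the smallest Position.Z
// (closest to the sensor) and ask the engine to track only them.
var closestIds = skeletons
    .Where(s => s.TrackingState != SkeletonTrackingState.NotTracked)
    .OrderBy(s => s.Position.Z)
    .Take(2)
    .Select(s => s.TrackingId)
    .ToArray();

sensor.SkeletonStream.AppChoosesSkeletons = true;

if(closestIds.Length == 2)
{
    sensor.SkeletonStream.ChooseSkeletons(closestIds[0], closestIds[1]);
}
```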
ClippedEdges
The ClippedEdges field describes which parts of the skeleton are out of Kinect's view. This provides macro insight into the skeleton's position. Use it to adjust the elevation angle programmatically or to message users to reposition themselves in the center of the view area. The property is of type FrameEdges, which is an enumeration decorated with the FlagsAttribute attribute. This means the ClippedEdges field can have one or more FrameEdges values. Refer to Appendix A for more information about working with bit fields such as this one. Table 4-3 lists the possible FrameEdges values.
Table 4-3. FrameEdges Values

FrameEdges    What it means

Bottom        The user has one or more body parts below Kinect's field of view.
Left          The user has one or more body parts off Kinect's left.
Right         The user has one or more body parts off Kinect's right.
Top           The user has one or more body parts above Kinect's field of view.
None          The user is completely in view of the Kinect.
It is possible to improve the quality of skeleton data if any part of the user's body is out of the view area. The easiest solution is to present the user with a message asking them to adjust their position until the clipping is either resolved or in an acceptable state. For example, an application may not be concerned that the user is bottom clipped, but messages the user if they become clipped on the left or right. The other solution is to physically adjust the tilt of the Kinect. Kinect has a built-in motor that tilts the camera head up and down. The angle of the tilt is adjustable by changing the value of the ElevationAngle property on the KinectSensor object. If an application is more concerned with the feet joints of the skeleton, it needs to ensure the user is not bottom clipped. Adjusting the tilt angle of the sensor helps keep the user's bottom joints in view.
The ElevationAngle is measured in degrees. The KinectSensor object properties MinElevationAngle and MaxElevationAngle define the value range. Any attempt to set the angle value outside of these values results in an ArgumentOutOfRangeException exception. Microsoft warns not to change the tilt angle repeatedly as it may wear out the tilt motor. To help save developers from mistakes and to help preserve the motor, the SDK limits the number of value changes to one per second. Further, it enforces a 20-second break after 15 consecutive changes, in that the SDK will not honor any value changes for 20 seconds following the 15th change.
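Reacting to bottom clipping by nudging the tilt could look like the sketch below. Assumptions: sensor is an initialized KinectSensor, skeleton is a tracked Skeleton, and the 5-degree step is an arbitrary choice.

```csharp
// FrameEdges is a [Flags] enumeration, so test membership with a bitwise AND.
if((skeleton.ClippedEdges & FrameEdges.Bottom) == FrameEdges.Bottom)
{
    int newAngle = sensor.ElevationAngle - 5;

    // Clamp to the legal range to avoid an ArgumentOutOfRangeException,
    // and remember the SDK throttles how often this value may change.
    if(newAngle >= sensor.MinElevationAngle)
    {
        sensor.ElevationAngle = newAngle;
    }
}
```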
Joints
Each Skeleton object has a property named Joints. This property is of type JointsCollection and contains a set of Joint structures that describe the trackable joints (head, hands, elbow, and others) of a skeleton. An application references specific joints by using the indexer on the JointsCollection, where the identifier is a value from the JointType enumeration. The JointsCollection is always fully populated and returns a Joint structure for any JointType even when there are no users in view.
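Accessing a specific joint through the indexer looks like this, assuming skeleton is a Skeleton copied from the frame data:

```csharp
// The collection is always fully populated, so the indexer never fails,
// but always check the joint's TrackingState before trusting Position.
Joint head = skeleton.Joints[JointType.Head];

if(head.TrackingState != JointTrackingState.NotTracked)
{
    SkeletonPoint p = head.Position;
    Console.WriteLine("Head at ({0:F2}, {1:F2}, {2:F2})", p.X, p.Y, p.Z);
}
```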
Joint
The skeleton tracking engine follows and reports on twenty points, or joints, on each user. The Joint structure represents the tracking data with three properties. The JointType property of the Joint is a value from the JointType enumeration. Figure 4-3 illustrates all trackable joints.

Figure 4-3. Illustrated skeleton joints

Each joint has a Position, which is of type SkeletonPoint and reports the X, Y, and Z of the joint. The X and Y values are relative to skeleton space, which is not the same as the depth or video space. The KinectSensor has a set of methods (described later in the Space and Transforms section) that convert skeleton points to depth points. Finally, there is the JointTrackingState property, which describes if the joint is being tracked and how. Table 4-4 lists the different tracking states.
Table 4-4. JointTrackingState Values

JointTrackingState    What it means

Inferred      The skeleton engine cannot see the joint in the depth frame pixels, but has
              made a calculated determination of the position of the joint.

NotTracked    The position of the joint is indeterminable. The Position value is a zero point.

Tracked       The joint is detected and actively followed.
Kinect the Dots
Going through basic exercises that illustrate parts of a larger concept is one thing. Building fully functional, usable applications is another. We dive into using the skeleton engine by building a game called Kinect the Dots. Every child grows up with coloring books and connect-the-dots drawing books. A child takes a crayon and draws a line from one dot to another in a specific sequence. A number next to each dot defines the sequence. We will build this game, but instead of using a crayon, the children (in all of us) use their hands.
This obviously is not an action-packed, first-person shooter or MMO that will consume the tech hipsters or hardcore readers of Slashdot or TechCrunch, but it is perfect for our purposes. We want to create a real application with the skeleton engine that uses joint data for some other use besides rendering to the UI. This game presents opportunities to introduce Natural User Interface (NUI) design concepts, and a means of developing a common Kinect user interface: hand tracking. Kinect the Dots is an application we can build with no production assets (images, animations, hi-fidelity designs), but instead using only core WPF drawing tools. However, with a little production effort on your part, it can be a fully polished application. And, yes, while you and I may not derive great entertainment from it, grab a son, daughter, or little brother, sister, niece, nephew, and watch how much fun they have.
Before coding any project, we need to define our feature set. Kinect the Dots is a puzzle game where a user draws an image by following a sequence of dots. Immediately, we can identify an entity object: a puzzle. Each puzzle consists of a series of dots or points. The order of the points defines the sequence. We will create a class named DotPuzzle that has a collection of Point objects. Initially, it might seem unnecessary to have the class (why can't we just have an array member variable of points?), but later it will make adding new features easy. The application uses the puzzle points in two ways, the first being to draw the dots on the screen. The second is to detect when a user makes contact with a dot.
When a user makes contact with a dot, the application begins drawing a line, with the starting point of the line anchored at the dot. The end-point of the line follows the user's hand until the hand makes contact with the next dot in the sequence. The next sequential dot becomes the anchor for the end-point, and a new line starts. This continues until the user draws a line from the last point back to the first point. The puzzle is then complete and the game is over.
With the rules of the game defined, we are ready to code. As we progress through the project, we will find new ideas for features and add them as needed. Start by creating a new WPF project and referencing the SDK library Microsoft.Kinect.dll. Add code to detect and initialize a single sensor. This should include subscribing to the SkeletonFrameReady event.
The User Interface
The code in Listing 4-4 is the XAML for the project, and there are a couple of important observations to note. The Polyline element renders the line drawn from dot to dot. As the user moves his hand from dot to dot, the application adds points to the line. The PuzzleBoardElement Canvas element holds the UI elements for the dots. The order of the UI elements in the LayoutRoot Grid is intentional. We use a layered approach so that our hand cursor, represented by the Image element, is always in front of the dots and lines. The other reason for putting these UI elements in their own container is that resetting the current puzzle or starting a new puzzle is easy. All we have to do is clear the child elements of the PuzzleBoardElement and the CrayonElement, and the other UI elements are unaffected.
The Viewbox and Grid elements are critical to the UI looking as the user expects it to. We know that the values of each skeletal joint are based in skeleton space. This means that we have to translate the joint vectors to be in our UI space. For this project, we will hard-code the UI space and not allow it to float based on the size of the UI window. The Grid element defines the UI space as 1920x1200. We are using the exact dimensions of 1920x1200, because that is a common full screen size; also, it is proportional to the depth image sizes of Kinect. This makes coordinate system transforms clearer and provides for smoother cursor movements.
Listing 4-4. XAML for Kinect the Dots

<Window x:Class="Chapter5KinectTheDots.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="600" Width="800" Background="White">
    <Viewbox>
        <Grid x:Name="LayoutRoot" Width="1920" Height="1200">
            <Polyline x:Name="CrayonElement" Stroke="Black" StrokeThickness="3"/>
            <Canvas x:Name="PuzzleBoardElement"/>
            <Canvas x:Name="GameBoardElement">
                <Image x:Name="HandCursorElement" Source="Images/hand.png"
                       Width="75" Height="75" RenderTransformOrigin="0.5,0.5">
                    <Image.RenderTransform>
                        <TransformGroup>
                            <ScaleTransform x:Name="HandCursorScale" ScaleX="1"/>
                        </TransformGroup>
                    </Image.RenderTransform>
                </Image>
            </Canvas>
        </Grid>
    </Viewbox>
</Window>
Having a hard-coded UI space also makes it easier on us, the developers. We want to make the process of translating from skeleton space to our UI space quick and easy with as few lines of code as possible. Further, reacting to window size changes adds more work for us that is not relevant to our main task. We can be lazy and let WPF do the scaling work for us by wrapping the Grid in a Viewbox control. The Viewbox scales its children based on their size in relation to the available size of the window.
The final UI element to point out is the Image. This element is the hand cursor. In this project, we use a simple image of a hand, but you can find your own image (it doesn't have to be shaped like a hand) or you can create some other UI element for the cursor such as an Ellipse. The image in this example is a right hand. In the code that follows, we give the user the option of using his left or right hand. If the user motions with his left hand, we flip the image so that it looks like a left hand using the ScaleTransform. The ScaleTransform helps to make the graphic look and feel right.
Hand Tracking
People interact with their hands, so knowing where and what the hands are doing is paramount to a successful and engaging Kinect application. The location and movements of the hands are the basis for virtually all gestures. Tracking the movements of the hands is the most common use of the data returned by Kinect. This is certainly the case with our application, as we ignore all other joints.
When drawing in a connect-the-dots book, a person normally draws with a pencil or crayon, using a single hand. One hand controls the crayon to draw lines from one dot to another. Our application replicates a single hand drawing with crayon on a paper interface. This user interface is natural and already known to users. Further, it requires little instruction to play, and the user quickly becomes immersed in the experience. As a result, the application inherently becomes more enjoyable. It is crucial to the success of any Kinect application that the application be as intuitive and non-invasive to the user's natural form of interaction as possible. Best of all, it requires minimal coding effort on our part.
Users naturally extend or reach their arms towards Kinect. Within this application, whichever hand is closest to Kinect, farthest from the user, becomes the drawing or primary hand. The user has the option of switching hands at any time in the game. This allows both lefties and righties to play the game comfortably. Coding the application to these features creates the crayon-on-paper analogy and satisfies our goal of creating a natural user interface. The code starts with Listing 4-5.
Listing 4-5. SkeletonFrameReady Event Handler

private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            frame.CopySkeletonDataTo(this._FrameSkeletons);
            Skeleton skeleton = GetPrimarySkeleton(this._FrameSkeletons);

            if(skeleton == null)
            {
                HandCursorElement.Visibility = Visibility.Collapsed;
            }
            else
            {
                Joint primaryHand = GetPrimaryHand(skeleton);
                TrackHand(primaryHand);
            }
        }
    }
}
private static Skeleton GetPrimarySkeleton(Skeleton[] skeletons)
{
    Skeleton skeleton = null;

    if(skeletons != null)
    {
        //Find the closest skeleton
        for(int i = 0; i < skeletons.Length; i++)
        {
            if(skeletons[i].TrackingState == SkeletonTrackingState.Tracked)
            {
                if(skeleton == null)
                {
                    skeleton = skeletons[i];
                }
                else
                {
                    if(skeleton.Position.Z > skeletons[i].Position.Z)
                    {
                        skeleton = skeletons[i];
                    }
                }
            }
        }
    }

    return skeleton;
}
Each time the event handler executes, we find the first valid skeleton. The application does not lock in on a single skeleton, because it does not need to track or follow a single user. If there are two visible users, the user closest to Kinect becomes the primary user. This is the function of the GetPrimarySkeleton method. If there are no detectable users, then the application hides the hand cursor; otherwise, we find the primary hand and update the hand cursor. The code for finding the primary hand is in Listing 4-6.
The primary hand is always the hand closest to Kinect. However, the code is not as simple as checking the Z value of the left and right hands and taking the lower value. Remember that a Z value of zero means the depth value is indeterminable. Because of this, we have to do more validation on the joints. Checking the TrackingState of each joint tells us the conditions under which the position data was calculated. The left hand is the default primary hand, for no other reason than that the author is left-handed. The right hand then has to be tracked explicitly (JointTrackingState.Tracked) or implicitly (JointTrackingState.Inferred) for us to consider it as a replacement for the left hand.

Tip  When working with Joint data, always check the TrackingState. Not doing so often leads to unexpected position values, a misbehaving UI, or exceptions.
Listing 4-6. Getting the Primary Hand and Updating the Cursor Position

private static Joint GetPrimaryHand(Skeleton skeleton)
{
    Joint primaryHand = new Joint();

    if(skeleton != null)
    {
        primaryHand = skeleton.Joints[JointType.HandLeft];
        Joint rightHand = skeleton.Joints[JointType.HandRight];

        if(rightHand.TrackingState != JointTrackingState.NotTracked)
        {
            if(primaryHand.TrackingState == JointTrackingState.NotTracked)
            {
                primaryHand = rightHand;
            }
            else
            {
                if(primaryHand.Position.Z > rightHand.Position.Z)
                {
                    primaryHand = rightHand;
                }
            }
        }
    }

    return primaryHand;
}
With the primary hand known, the next action is to update the position of the hand cursor (see Listing 4-7). If the hand is not tracked, then the cursor is hidden. In more professionally finished applications, hiding the cursor might be done with a nice animation, such as a fade or zoom out. For this project, it suffices to simply set the Visibility property to Visibility.Collapsed. When tracking a hand, we ensure the cursor is visible, calculate the X, Y position of the hand in our UI space, update its screen position, and set the ScaleTransform (HandCursorScale) based on the hand being left or right. The calculation to determine the position of the cursor is interesting, and requires further examination. This code is similar to code we wrote in the stick figure example (Listing 4-3). We cover transformations later, but for now just know that the skeleton data is in another coordinate space than the UI elements, and we need to convert the position values from one coordinate space to another.
Listing 4-7. Updating the Position of the Hand Cursor

private void TrackHand(Joint hand)
{
    if(hand.TrackingState == JointTrackingState.NotTracked)
    {
        HandCursorElement.Visibility = System.Windows.Visibility.Collapsed;
    }
    else
    {
        HandCursorElement.Visibility = System.Windows.Visibility.Visible;

        DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(hand.Position,
                                                DepthImageFormat.Resolution640x480Fps30);
        float x = (float) ((point.X * LayoutRoot.ActualWidth /
                            this.KinectDevice.DepthStream.FrameWidth) -
                           (HandCursorElement.ActualWidth / 2.0));
        float y = (float) ((point.Y * LayoutRoot.ActualHeight /
                            this.KinectDevice.DepthStream.FrameHeight) -
                           (HandCursorElement.ActualHeight / 2.0));

        Canvas.SetLeft(HandCursorElement, x);
        Canvas.SetTop(HandCursorElement, y);
    }

    if(hand.JointType == JointType.HandRight)
    {
        HandCursorScale.ScaleX = 1;
    }
    else
    {
        HandCursorScale.ScaleX = -1;
    }
}
At this point, compile and run the application. The application produces output similar to that shown in Figure 4-4, which shows a skeleton stick figure to better illustrate the hand movements. With hand tracking in place and functional, we move to the next phase of the project to begin gameplay implementation.

Figure 4-4. Hand tracking with stick figure waving left hand
Drawing the Puzzle
Listing 4-8 shows the DotPuzzle class. It is simple to the point that you may question its need, but it serves as a basis for later expansion. The primary function of this class is to hold a collection of points that compose the puzzle. The position of each point in the Dots collection determines its sequence in the puzzle. The composition of the class lends itself to serialization. Because it is easily serializable, we could expand our application to read puzzles from XML files.
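As a sketch of that idea, loading a puzzle might look like the following; the file name puzzles.xml is our own invention, and System.IO plus System.Xml.Serialization are assumed to be in scope:

```csharp
// DotPuzzle has a public parameterless constructor and a public
// Dots property, which is all XmlSerializer needs.
XmlSerializer serializer = new XmlSerializer(typeof(DotPuzzle));

using(FileStream stream = File.OpenRead("puzzles.xml"))
{
    DotPuzzle puzzle = (DotPuzzle) serializer.Deserialize(stream);
}
```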
Listing 4-8. DotPuzzle Class

public class DotPuzzle
{
    public DotPuzzle()
    {
        this.Dots = new List<Point>();
    }

    public List<Point> Dots { get; set; }
}
With the UI laid out and our primary entity object defined, we move on to creating our first puzzle. In the constructor of the MainWindow class, create a new instance of the DotPuzzle class and define some points. The code in bold in Listing 4-9 shows how this is done. The code listing also shows the member variables we will use for this application. The variable _PuzzleDotIndex is used to track the user's progress in solving the puzzle. We initially set the _PuzzleDotIndex variable to -1 to indicate that the user has not started the puzzle.
Listing 4-9. MainWindow Member Variables and Constructor

private DotPuzzle _Puzzle;
private int _PuzzleDotIndex;

public MainWindow()
{
    InitializeComponent();

    //Sample puzzle
    this._Puzzle = new DotPuzzle();
    this._Puzzle.Dots.Add(new Point(200, 300));
    this._Puzzle.Dots.Add(new Point(1600, 300));
    this._Puzzle.Dots.Add(new Point(1650, 400));
    this._Puzzle.Dots.Add(new Point(1600, 500));
    this._Puzzle.Dots.Add(new Point(1000, 500));
    this._Puzzle.Dots.Add(new Point(1000, 600));
    this._Puzzle.Dots.Add(new Point(1200, 700));
    this._Puzzle.Dots.Add(new Point(1150, 800));
    this._Puzzle.Dots.Add(new Point(750, 800));
    this._Puzzle.Dots.Add(new Point(700, 700));
    this._Puzzle.Dots.Add(new Point(900, 600));
    this._Puzzle.Dots.Add(new Point(900, 500));
    this._Puzzle.Dots.Add(new Point(200, 500));
    this._Puzzle.Dots.Add(new Point(150, 400));

    this._PuzzleDotIndex = -1;

    this.Loaded += MainWindow_Loaded;
}
The last step in completing the UI is to draw the puzzle points. This code is in Listing 4-10. We create a method named DrawPuzzle, which we call when the MainWindow loads (MainWindow_Loaded event handler). DrawPuzzle iterates over each dot in the puzzle, creating UI elements to represent the dot. It then places that dot in the PuzzleBoardElement. An alternative to building the UI in code is to build it in the XAML. We could have attached the DotPuzzle object to the ItemsSource property of an ItemsControl object. The ItemsControl's ItemTemplate property would then define the look and placement of each dot. This design is more elegant, because it allows for theming of the user interface. The design demonstrated in these pages was chosen to keep the focus on the Kinect code and not the general WPF code, as well as to reduce the number of lines of code in print. You are highly encouraged to refactor the code to leverage WPF's powerful data binding and styling systems and the ItemsControl.
Listing 4-10. Drawing the Puzzle

private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
    KinectSensor.KinectSensors.StatusChanged += KinectSensors_StatusChanged;
    this.KinectDevice = KinectSensor.KinectSensors.FirstOrDefault(x => x.Status ==
                                                                  KinectStatus.Connected);
    DrawPuzzle(this._Puzzle);
}

private void DrawPuzzle(DotPuzzle puzzle)
{
    PuzzleBoardElement.Children.Clear();

    if(puzzle != null)
    {
        for(int i = 0; i < puzzle.Dots.Count; i++)
        {
            Grid dotContainer = new Grid();
            dotContainer.Width = 50;
            dotContainer.Height = 50;
            dotContainer.Children.Add(new Ellipse() { Fill = Brushes.Gray });

            TextBlock dotLabel = new TextBlock();
            dotLabel.Text = (i + 1).ToString();
            dotLabel.Foreground = Brushes.White;
            dotLabel.FontSize = 24;
            dotLabel.HorizontalAlignment = System.Windows.HorizontalAlignment.Center;
            dotLabel.VerticalAlignment = System.Windows.VerticalAlignment.Center;
            dotContainer.Children.Add(dotLabel);

            //Position the UI element centered on the dot point
            Canvas.SetTop(dotContainer, puzzle.Dots[i].Y - (dotContainer.Height / 2));
            Canvas.SetLeft(dotContainer, puzzle.Dots[i].X - (dotContainer.Width / 2));
            PuzzleBoardElement.Children.Add(dotContainer);
        }
    }
}
Solving the Puzzle
Up to this point, we have built a user interface and created a base infrastructure for puzzle data; we are visually tracking a user's hand movements and drawing puzzles based on that data. The final code to add draws lines from one dot to another. When the user moves her hand over a dot, we establish that dot as an anchor for the line. The line's end-point is wherever the user's hand is. As the user moves her hand around the screen, the line follows. The code for this functionality is in the TrackPuzzle method shown in Listing 4-11.
The majority of the code in this block is dedicated to drawing lines on the UI. The other parts enforce the rules of the game, such as following the correct sequence of the dots. However, one section of code does neither of these things. Its function is to make the application more user-friendly. The code calculates the length difference between the next dot in the sequence and the hand position, and checks to see if that distance is less than 25 pixels. The number 25 is arbitrary, but it is a good solid number for our UI. Kinect never reports completely smooth joint positions, even with smoothing parameters applied. Additionally, users rarely have a steady hand. Therefore, it is important for applications to have a hit zone larger than the actual UI element target. This is a design principle common in touch interfaces and applies to Kinect as well. If the user comes close to the hit zone, we give her credit for hitting the target.
Note  The calculations to get the results stored in the point dotDiff and length are examples of vector math. This type of math can be quite common in Kinect applications. Grab an old grade school math book or use the built-in vector routines of .NET.
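For instance, the distance check could equivalently use System.Windows.Vector. A minimal sketch, where dot and handPoint are the Points from Listing 4-11:

```csharp
// Subtracting two Points yields a Vector whose Length property
// is the Euclidean distance between them.
Vector dotDiff = dot - handPoint;

if(dotDiff.Length < 25)
{
    //Cursor is within the hit zone
}
```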
Finally, add one line of code to the SkeletonFrameReady event handler to call the TrackPuzzle method. The call fits perfectly right after the call to TrackHand in Listing 4-5. Adding this code completes the application. Compile. Run. Kinect the dots!
Listing 4-11. Drawing the Lines to Kinect the Dots

using Nui = Microsoft.Kinect;

private void TrackPuzzle(SkeletonPoint position)
{
    if(this._PuzzleDotIndex == this._Puzzle.Dots.Count)
    {
        //Do nothing - Game is over
    }
    else
    {
        Point dot;

        if(this._PuzzleDotIndex + 1 < this._Puzzle.Dots.Count)
        {
            dot = this._Puzzle.Dots[this._PuzzleDotIndex + 1];
        }
        else
        {
            dot = this._Puzzle.Dots[0];
        }

        DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(position,
                                                DepthImageFormat.Resolution640x480Fps30);
        point.X = (int) (point.X * LayoutRoot.ActualWidth /
                         this.KinectDevice.DepthStream.FrameWidth);
        point.Y = (int) (point.Y * LayoutRoot.ActualHeight /
                         this.KinectDevice.DepthStream.FrameHeight);
        Point handPoint = new Point(point.X, point.Y);

        //Calculate the length between the two points. This can be done manually
        //as shown here or by using the System.Windows.Vector object to get the length.
        //System.Windows.Media.Media3D.Vector3D is available for 3D vector math.
        Point dotDiff = new Point(dot.X - handPoint.X, dot.Y - handPoint.Y);
        double length = Math.Sqrt(dotDiff.X * dotDiff.X + dotDiff.Y * dotDiff.Y);

        int lastPoint = this.CrayonElement.Points.Count - 1;

        if(length < 25)
        {
            //Cursor is within the hit zone
            if(lastPoint > 0)
            {
                //Remove the working end point
                this.CrayonElement.Points.RemoveAt(lastPoint);
            }

            //Set line end point
            this.CrayonElement.Points.Add(new Point(dot.X, dot.Y));

            //Set new line start point
            this.CrayonElement.Points.Add(new Point(dot.X, dot.Y));

            //Move to the next dot
            this._PuzzleDotIndex++;

            if(this._PuzzleDotIndex == this._Puzzle.Dots.Count)
            {
                //Notify the user that the game is over
            }
        }
        else
        {
            if(lastPoint > 0)
            {
                //To refresh the Polyline visual you must remove the last point,
                //update and add it back.
                Point lineEndpoint = this.CrayonElement.Points[lastPoint];
                this.CrayonElement.Points.RemoveAt(lastPoint);
                lineEndpoint.X = handPoint.X;
                lineEndpoint.Y = handPoint.Y;
                this.CrayonElement.Points.Add(lineEndpoint);
            }
        }
    }
}
Successfully building and running the application yields results like that of Figure 4-5. The hand cursor tracks the user's movements and begins drawing connecting lines after making contact with the next dot in the sequence. The application validates to ensure the dot connections are made in sequence. For example, if the user in Figure 4-5 moves their hand to dot 11, the application does not create a connection. This concludes the project and successfully demonstrates basic skeleton processing in a fun way. Read the next section for ideas on expanding Kinect the Dots to take it beyond a simple walkthrough-level project.

Figure 4-5. Kinect the Dots in action
Expanding the Game
Kinect the Dots is functionally complete. A user can start the application and move his hands around to solve the puzzle. However, it is far from a polished application. It needs some fit-and-finish. The most obvious tweak is to add smoothing. You should have noticed that the hand cursor is jumpy. The second most obvious feature to add is a way to reset the puzzle. As it stands, once the user solves the puzzle there is nothing more to do but kill the application, and that's no fun! Your users are chanting "More! More! More!"
One option is to create a hot spot in the upper left corner and label it "Reset." When the user's hand enters this area, the application resets the puzzle by setting _PuzzleDotIndex to -1 and clearing the points from the CrayonElement. It would be smart to create a private method named ResetPuzzle that does this work. This makes the reset code more reusable.
Here are more features you are highly encouraged to add to the game to make it a complete experience:

•  Create more puzzles! Make the application smarter so that when it initially loads it reads a collection of puzzles from an XML file. Then randomly present the user with a puzzle. Or…

•  Give the user the option to select which puzzle she wants to solve. The user selects a puzzle and it draws. At this point, this is an advanced feature to add to the application. It requires the user to select an option from a list. If you are ambitious, go for it! After reading the chapter on gestures, this will be easy. A quick solution is to build the menu so that it works with touch or a mouse. Kinect and touch work very well together.

•  Advance the user to a new puzzle once she has completed the current puzzle.

•  Add extra data, such as a title and background image, to each puzzle. Display the title at the top of the screen. For example, if the puzzle is a fish, the background can be an underwater scene with a sea floor, a starfish, mermaids, and other fish. The background image would be defined as a property on the DotPuzzle object.

•  Add automated user assistance to help users struggling to find the next dot. In the code, start a timer when the user connects with a dot. Each time the user connects with a dot, reset the timer. If the timer goes off, it means the user is having trouble finding the next dot. At this point, the application displays a pop-up message pointing to the next dot.

•  Reset the puzzle when a user leaves the game. Suppose a user has to leave the game for some reason, to answer the phone, get a drink of water, or go for a bathroom break. When Kinect no longer detects any users, start a timer. When the timer expires, reset the puzzle. You did remember to put the reset code in its own method so that it can be called from multiple places, right?

•  Reward the user for completing a puzzle by playing an exciting animation. You could even have an animation each time the user successfully connects a dot. The dot-for-dot animations should be subtle yet rewarding. You do not want them to annoy the user.

•  After solving a puzzle, let the user color the screen. Give him the option of selecting a color from a metaphoric color box. After selecting a color, wherever his hand goes on the screen the application draws color.
Space and Transforms
In each of the example projects in this chapter, we processed and manipulated the Position point of Joints. In almost all circumstances, this data is unusable when raw. Skeleton points are measured differently from depth or video data. Each individual set of data (skeleton, depth, and video) is defined within a specific geometric coordinate plane or space. Depth and video space are measured in pixels, with the zero X and Y positions being at the upper left corner. The Z dimension in the depth space is measured in millimeters. However, the skeleton space is measured in meters, with the zero X and Y positions at the center of the depth sensor. Skeleton space uses a right-handed coordinate system where the positive values of the X-axis extend to the right and the positive Y-axis extends upward. The X-axis ranges from -2.2 to 2.2 (7.22') for a total span of 4.4 meters or 14.44 feet; the Y-axis ranges from -1.6 to 1.6 (5.25'); and the Z-axis from 0 to 4 (13.1233'). Figure 4-6 illustrates the coordinate space of the skeleton stream.
Figure 4-6. Skeleton space
Space Transformations
Kinect experiences are all about the user interacting with a virtual space. The more interactions an application makes possible, the more engaging and entertaining the experience. We want users to swing virtual rackets, throw virtual bowling balls, push buttons, and swipe through menus. In the case of Kinect the Dots, we want the user to move her hand within range of a dot. For us to know that a user is connecting one dot to another, we have to know that the user's hand is over a dot. This determination is only possible by transforming the skeleton data to our visual UI space. Since the SDK does not give skeleton data to us in a form directly usable with our user interface and visual elements, we have to do some work.
Converting or transforming skeleton space values to depth space is easy. The SDK provides a couple of helper methods to transform between the two spaces. The KinectSensor object has a method named MapSkeletonPointToDepth to transform skeleton points into points usable for UI transformations. There is also a method named MapDepthToSkeletonPoint, which performs the conversion in reverse. The MapSkeletonPointToDepth method takes a SkeletonPoint and a DepthImageFormat. The skeleton point can come from the Position property on the Skeleton or the Position property of a Joint on the skeleton. While the name of the method has the word "Depth" in it, this is not to be taken literally. The destination space does not have to be a Kinect depth image. In fact, the DepthStream does not have to be enabled. The transform is to any spatial plane of the dimensions supported by the DepthImageFormat. Once the skeleton point is mapped to the depth space, it can be scaled to any desired dimension.
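A minimal sketch of that mapping and scaling, assuming sensor is an initialized KinectSensor, skeletonPoint is a SkeletonPoint, and targetWidth and targetHeight are whatever UI dimensions you are drawing into:

```csharp
DepthImagePoint depthPoint = sensor.MapSkeletonPointToDepth(skeletonPoint,
                                         DepthImageFormat.Resolution640x480Fps30);

// Scale from the 640x480 depth space into the target space.
double x = depthPoint.X * targetWidth / 640.0;
double y = depthPoint.Y * targetHeight / 480.0;
```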
In the stick figure exercise, the GetJointPoint method (Listing 4-3) transforms each skeleton point to a pixel in the space of the LayoutRoot element, because this is the space in which we want to draw skeleton joints. In the Kinect the Dots project, we perform this transform twice. The first is in the TrackHand method (Listing 4-7). In this instance, we calculated the transform and then adjusted the position so that the hand cursor is centered at that point. The other instance is in the TrackPuzzle method (Listing 4-11), which draws the lines based on the user's hand movements. Here the calculation is simple and just transforms to the LayoutRoot element's space. Both calculations are the same in that they transform to the same UI element's space.
Tip: Create helper methods to perform space transforms. There are always five variables to consider: source vector, destination width, destination height, destination width offset, and destination height offset.
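The tip above can be sketched as a single helper. The book's code is C#, but the arithmetic is language-agnostic; here is a minimal Python sketch with hypothetical parameter names (the source frame dimensions are also needed to compute the scale factors):

```python
def transform(source_x, source_y, source_width, source_height,
              dest_width, dest_height, dest_x_offset=0.0, dest_y_offset=0.0):
    """Scale a source-space point (e.g., from a 640x480 depth frame) into a
    destination space, then shift it by the destination offsets."""
    x = source_x * (dest_width / source_width) + dest_x_offset
    y = source_y * (dest_height / source_height) + dest_y_offset
    return (x, y)
```

For example, mapping the center of a 640x480 depth frame onto a 1280x960 panel yields the panel's center, (640.0, 480.0).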
Looking in the Mirror
As you may have noticed from the projects, the skeleton data is mirrored. Under most circumstances, this is acceptable. In Kinect the Dots, it works because users expect the cursor to mimic their hand movements exactly. This also works for augmented reality experiences where the user interface is based on the video image and the user expects a mirrored effect. Many games represent the user with an avatar where the avatar's back faces the user. This is a third-person perspective view. However, there are instances where the mirrored data is not conducive to the UI presentation. Applications or games that have a front-facing avatar, where the user sees the face of the avatar, do not want the mirrored effect. When the user waves his left arm, the left arm of the avatar should wave. Without making a small manipulation to the skeleton data, the avatar's right arm waves, which is clearly incorrect.
Unfortunately, the SDK does not have an option or property to set that causes the skeleton engine to produce non-mirrored data. This work is the responsibility of the developer but, luckily, it is a trivial operation due to the nature of skeleton data. The non-mirrored effect works by inverting the X value of each skeleton position vector. To calculate the inverse of any vector component, multiply it by -1. Experiment with the stick figure project by updating the GetJointPoint method (originally shown in Listing 4-3) as shown in Listing 4-12. With this change in place, when the user raises his left arm, the arm on the right side of the stick figure will rise.
Listing 4-12. Reversing the Mirror
private Point GetJointPoint(Joint joint)
{
    DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(joint.Position,
                                          DepthImageFormat.Resolution640x480Fps30);
    int depthX = point.X;
    int depthY = point.Y;

    //Multiplying by -1 inverts the X value, reversing the mirror.
    depthX *= -1 * (int) this.LayoutRoot.ActualWidth /
                   KinectDevice.DepthStream.FrameWidth;
    depthY *= (int) this.LayoutRoot.ActualHeight /
                   KinectDevice.DepthStream.FrameHeight;

    return new Point((double) depthX, (double) depthY);
}
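One common way to un-mirror a coordinate is to reflect it about the vertical center of the drawing surface, which keeps the result in on-screen coordinates. A quick Python sketch of that idea (a hypothetical helper for illustration, not code from the book's project):

```python
def unmirror_x(x, surface_width):
    """Reflect an x coordinate about the vertical center line of a drawing
    surface of the given width, reversing a mirrored view."""
    return surface_width - x
```

A point 100 pixels from the left edge of a 640-pixel-wide surface maps to 540, that is, 100 pixels from the right edge; the center line is unchanged.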
The SkeletonViewer User Control
As you work with Kinect and the Kinect for Windows SDK to build interactive experiences, you find that during development it is very helpful to actually see a visual representation of skeleton and joint data. When debugging an application, it is helpful to see and understand the raw input data, but in the production version of the application, you do not want to see this information. One option is to take the code we wrote in the skeleton viewer exercise and copy and paste it into each application. After a while, this becomes tedious and clutters your codebase unnecessarily. For these reasons and others, it is helpful to refactor this code so that it is reusable.
Our goal is to take the skeleton viewer code and add to it so that it provides us more helpful debugging information. We accomplish this by creating a user control, which we name SkeletonViewer. In this user control, the skeleton and joint UI elements are drawn to the UI root of the user control. The SkeletonViewer control can then be a child of any panel we want. Start by creating a user control and replace the root Grid element with the code in Listing 4-13.
Listing 4-13. SkeletonViewer XAML
<Grid>
    <Grid x:Name="SkeletonsPanel"/>
    <Canvas x:Name="JointInfoPanel"/>
</Grid>
The SkeletonsPanel is where we will draw the stick figures just as we did earlier in this chapter. The JointInfoPanel is where the additional debugging information will go. We'll go into more detail on this later. The next step is to link the user control with a KinectSensor object. For this, we create a DependencyProperty, which allows us to use data binding if we desire. Listing 4-14 has the code for this property. The KinectDeviceChanged static method is critical to the function and performance of any application using this control. The first step unsubscribes the event handler from SkeletonFrameReady for any previously associated KinectSensor object. Not removing the event handler causes memory leaks. An even better approach is to use the weak event handler pattern, the details of which are beyond our scope. The other half of this method subscribes to the SkeletonFrameReady event when the KinectDevice property is set to a non-null value.
Listing 4-14. Runtime DependencyProperty
#region KinectDevice
protected const string KinectDevicePropertyName = "KinectDevice";

public static readonly DependencyProperty KinectDeviceProperty =
                       DependencyProperty.Register(KinectDevicePropertyName,
                                                   typeof(KinectSensor),
                                                   typeof(SkeletonViewer),
                                                   new PropertyMetadata(null, KinectDeviceChanged));

private static void KinectDeviceChanged(DependencyObject owner,
                                        DependencyPropertyChangedEventArgs e)
{
    SkeletonViewer viewer = (SkeletonViewer) owner;

    if(e.OldValue != null)
    {
        KinectSensor sensor = (KinectSensor) e.OldValue;
        sensor.SkeletonFrameReady -= viewer.KinectDevice_SkeletonFrameReady;
    }

    if(e.NewValue != null)
    {
        viewer.KinectDevice = (KinectSensor) e.NewValue;
        viewer.KinectDevice.SkeletonFrameReady += viewer.KinectDevice_SkeletonFrameReady;
    }
}
public KinectSensor KinectDevice
{
get { return (KinectSensor)GetValue(KinectDeviceProperty); }
set { SetValue(KinectDeviceProperty, value); }
}
#endregion KinectDevice
Now that the user control is receiving new skeleton data from the KinectSensor, we can start drawing skeletons. Listing 4-15 shows the SkeletonFrameReady event handler. Much of this code is the same as that used in the previous exercises. The skeleton processing logic is wrapped in an if statement so that it only executes when the control's IsEnabled property is set to true. This allows the application to turn the function of this control on and off easily. There are two operations performed on each skeleton. The method draws the skeleton by calling the DrawSkeleton method. DrawSkeleton and its helper methods (CreateFigure and GetJointPoint) are the same methods that we used in the stick figure example. You can copy and paste the code to this source file.
The new lines of code to call the TrackJoint method are shown in bold. This method displays the extra joint information. The source for this method is also part of Listing 4-15. TrackJoint draws a circle at the location of the joint and displays the X, Y, and Z values next to the joint. The X and Y values are in pixels and are relative to the width and height of the user control. The Z value is the depth converted to feet. If you are not a lazy American and know the metric system, you can leave the value in meters.
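The feet conversion is a single multiplication. As a quick sanity check, here is the same conversion in Python; the constant name mirrors the FeetPerMeters constant used in Listing 4-15:

```python
FEET_PER_METER = 3.2808399  # feet in one meter

def meters_to_feet(meters):
    """Convert a joint's Z distance, reported in meters by the SDK, to feet."""
    return meters * FEET_PER_METER
```

A user standing two meters from the sensor is roughly 6.56 feet away.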
Listing 4-15. Drawing Skeleton Joints and Information
private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    SkeletonsPanel.Children.Clear();
    JointInfoPanel.Children.Clear();

    if(this.IsEnabled)
    {
        using(SkeletonFrame frame = e.OpenSkeletonFrame())
        {
            if(frame != null)
            {
                Brush brush;
                Skeleton skeleton;

                frame.CopySkeletonDataTo(this._FrameSkeletons);

                for(int i = 0; i < this._FrameSkeletons.Length; i++)
                {
                    skeleton = this._FrameSkeletons[i];
                    brush = this._SkeletonBrushes[i];

                    DrawSkeleton(skeleton, brush);

                    TrackJoint(skeleton.Joints[JointType.HandLeft], brush);
                    TrackJoint(skeleton.Joints[JointType.HandRight], brush);
                    //You can track all the joints if you want
                }
            }
        }
    }
}
private void TrackJoint(Joint joint, Brush brush)
{
    if(joint.TrackingState != JointTrackingState.NotTracked)
    {
        Canvas container = new Canvas();
        Point jointPoint = GetJointPoint(joint);

        //FeetPerMeters is a class constant of 3.2808399f;
        double z = joint.Position.Z * FeetPerMeters;

        Ellipse element = new Ellipse();
        element.Height = 10;
        element.Width = 10;
        element.Fill = brush;
        Canvas.SetLeft(element, 0 - (element.Width / 2));
        Canvas.SetTop(element, 0 - (element.Height / 2));
        container.Children.Add(element);

        TextBlock positionText = new TextBlock();
        positionText.Text = string.Format("<{0:0.00}, {1:0.00}, {2:0.00}>",
                                          jointPoint.X, jointPoint.Y, z);
        positionText.Foreground = brush;
        positionText.FontSize = 24;
        Canvas.SetLeft(positionText, 0 - (positionText.Width / 2));
        Canvas.SetTop(positionText, 25);
        container.Children.Add(positionText);

        Canvas.SetLeft(container, jointPoint.X);
        Canvas.SetTop(container, jointPoint.Y);

        JointInfoPanel.Children.Add(container);
    }
}
Adding the SkeletonViewer to an application is quick and easy. Since it is a UserControl, simply add it to the XAML, and then in the main application set the KinectDevice property of the SkeletonViewer to the desired sensor object. Listing 4-16 demonstrates this by showing the code from the Kinect the Dots project that initializes the KinectSensor object. Figure 4-7, which follows Listing 4-16, is a screenshot of Kinect the Dots with the SkeletonViewer enabled.
Listing 4-16. Initializing the SkeletonViewer
if(this._KinectDevice != value)
{
    //Uninitialize
    if(this._KinectDevice != null)
    {
        this._KinectDevice.Stop();
        this._KinectDevice.SkeletonFrameReady -= KinectDevice_SkeletonFrameReady;
        this._KinectDevice.SkeletonStream.Disable();
        SkeletonViewerElement.KinectDevice = null;
    }

    this._KinectDevice = value;

    //Initialize
    if(this._KinectDevice != null)
    {
        if(this._KinectDevice.Status == KinectStatus.Connected)
        {
            this._KinectDevice.SkeletonStream.Enable();
            this._KinectDevice.Start();
            SkeletonViewerElement.KinectDevice = this.KinectDevice;
            this.KinectDevice.SkeletonFrameReady += KinectDevice_SkeletonFrameReady;
        }
    }
}
Figure 4-7. Kinect the Dots using the SkeletonViewer user control
Summary
As you may have noticed with the Kinect the Dots application, the Kinect code was only a small part of the application. This is common. Kinect is just another input device. When building applications driven by touch or mouse, the code focused on the input device is much less than the other application code. The difference is that the extraction of the input data and the work to process that data is handled largely for you by the .NET framework. As Kinect matures, it is possible that it too will be integrated into the .NET framework. Imagine having hand cursors built into the framework or OS just like the mouse cursor. Until that day happens, we have to write this code ourselves. The important point of focus is on doing stuff with the Kinect data, and not on how to extract data from the input device.
In this chapter, we examined every class, property, and method focused on skeleton tracking. In our first example, the application demonstrated how to draw a stick figure from the skeleton points, whereas the second project was a more functional real-world application experience. In it, we found practical uses for the joint data, in which the user, for the first time, actually interacted with the application. The user's natural movements provided input to the application. This chapter concludes our exploration of the fundamentals of the SDK. From here on out, we experiment and build functional applications to find the boundaries of what is possible with Kinect.
CHAPTER 5
Advanced Skeleton Tracking
This chapter marks the beginning of the second half of the book. The first set of chapters focused on the fundamental camera-centric features of the Kinect SDK. We explored and experimented with every method and property of every object focused on these features. These are the nuts and bolts of Kinect development. You now have the technical knowledge necessary to write applications using Kinect and the SDK. However, knowing the SDK and understanding how to use it as a tool to build great applications and experiences are substantially different matters. The remaining chapters of the book change tone and course in their coverage of the SDK. Moving forward we discuss how to use the SDK in conjunction with WPF and other third-party tools and libraries to build Kinect-driven experiences. We will use all the information you learned in the previous chapters to progress to more advanced and complex topics.
At its core, Kinect only emits and detects the reflection of infrared light. From the reflection of the light, it calculates depth values for each pixel of the view area. The first derivative of the depth data is the ability to detect blobs and shapes. The player index bits of each depth pixel are a form of a first derivative. The second derivative determines which of these shapes matches the human form, and then calculates the location of each significant axis point on the human body. This is skeleton tracking, which we covered in the previous chapter.
While the infrared image and the depth data are critical and core to Kinect, they are less prominent than skeleton tracking. In fact, they are a means to an end. As the Kinect and other depth cameras become more prevalent in everyday computer use, the raw depth data will receive less direct attention from developers, and become merely trivia or part of passing conversation. We are almost there now. The Microsoft Kinect SDK does not give the developer access to the infrared image stream of Kinect, but other Kinect SDKs make it available. It is likely that most developers will never use the raw depth data, but will only ever work with the skeleton data. However, once pose and gesture recognition become standardized and integrated into the Kinect SDK, developers likely will not even access the skeleton data.
We hope to advance this movement, because it signifies the maturation of Kinect as a technology. This chapter keeps the focus on skeleton tracking, but the approach to the skeleton data is different. We focus on Kinect as an input device with the same classification as a mouse, stylus, or touch, but uniquely different because of its ability to see depth. Microsoft pitched Kinect for Xbox with, "You are the controller," or more technically, you are the input device. With skeleton data, applications can do the same things a mouse or touch device can. The difference is the depth component allows the user and the application to interact as never before. Let us explore the mechanics through which the Kinect can control and interact with user interfaces.
User Interaction
Computers and the applications that run on them require input. Traditionally, user input comes from a keyboard and mouse. The user interacts directly with these hardware devices, which in turn transmit data to the computer. The computer takes the data from the input device and creates some type of visual effect. It is common knowledge that every computer with a graphical user interface has a cursor, which is often referred to as the mouse cursor, because the mouse was the original vehicle for the cursor. However, calling it a mouse cursor is no longer as accurate as it once was. Touch or stylus devices also control the same cursor as the mouse. When a user moves the mouse or drags his or her finger across a touch screen, the cursor reacts to these movements. If a user moves the cursor over a button, more often than not the button changes visually to indicate that the cursor is hovering over the button. The button gives another type of visual indicator when the user presses the mouse button while hovering over a button. Still another visual indicator emerges when the user releases the mouse button while remaining over a button. This process may seem trivial to think through step by step, but how much of this process do you really understand? If you had to, could you write the code necessary to track changes in the mouse's position, hover states, and button clicks?
These are user interface interactions developers often take for granted, because with user interface platforms like WPF, interacting with input devices is extremely easy. When developing web pages, the browser handles user interactions and the developer simply defines the visual treatments like mouse hover states using style sheets. However, Kinect is different. It is an input device that is not integrated into WPF. Therefore you, as the developer, are responsible for doing all of the work that the OS and WPF otherwise would do for you.
At a low level, a mouse, stylus, or touch essentially produces X and Y coordinates, which the OS translates into the coordinate space of the computer screen. This process is similar to that discussed in the previous chapter (Space Transformations). It is the operating system's responsibility to extract data from the input device and make it available to the graphical user interface and to applications. The graphical user interface of the OS displays a mouse cursor and moves the cursor around the screen in reaction to user input. In some instances, this work is not trivial and requires a thorough understanding of the GUI platform, which in our instance is WPF. WPF does not provide native support for the Kinect as it does for the mouse and other input devices. The burden falls on the developer to pull the data from the Kinect SDK and do the work necessary to interact with the Buttons, ListBoxes, and other interface controls. Depending on the complexity of your application or user interface, this can be a sizable task and potentially one that is non-trivial and requires intimate knowledge of WPF.
A Brief Understanding of the WPF Input System
When building an application in WPF, developers do not have to concern themselves with the mechanics of user input. It is handled for us, allowing us to focus more on reacting to user input. After all, as developers, we are more concerned with doing things with the user's input rather than reinventing the wheel each time just to collect user input. If an application needs a button, the developer adds a Button control to the screen, wires an event handler to the control's Click event, and is done. In most circumstances, the developer will style the button to have a unique look and feel and to react visually to different mouse interactions such as hover and mouse down. WPF handles all of the low-level work to determine when the mouse is hovering over the button, or when the button is clicked.

WPF has a robust input system that constantly gathers input from attached devices and distributes that input to the affected controls. This system starts with the API defined in the System.Windows.Input namespace (PresentationCore.dll). The entities defined within work directly with the operating system to get data from the input devices. For example, there are classes named Keyboard, Mouse, Stylus, Touch, and Cursor. The one class that is responsible for managing the input from the different input devices and marshalling that input to the rest of the presentation framework is the InputManager.
The other component to the WPF input system is a set of four classes in the System.Windows namespace (PresentationCore.dll). These classes are UIElement, ContentElement, FrameworkElement, and FrameworkContentElement. FrameworkElement inherits from UIElement and FrameworkContentElement inherits from ContentElement. These classes are the base classes for all visual elements in WPF such as Button, TextBlock, and ListBox.
Note: For more detailed information about WPF's input system, refer to the MSDN documentation at http://msdn.microsoft.com/en-us/library/ms754010.aspx.
The InputManager tracks all device input and uses a set of methods and events to notify UIElement and ContentElement objects that the input device is performing some action related to the visual element. For example, WPF raises the MouseEnterEvent event when the mouse cursor enters the visual space of a visual element. There is also a virtual OnMouseEnter method in the UIElement and ContentElement classes, which WPF also calls when the mouse enters the visual space of the object. This allows other objects, which inherit from the UIElement or ContentElement classes, to directly receive data from input devices. WPF calls these methods on the visual elements before it raises any input events. There are several other similar types of events and methods on the UIElement and ContentElement classes to handle the various types of interactions including MouseEnter, MouseLeave, MouseLeftButtonDown, MouseLeftButtonUp, TouchEnter, TouchLeave, TouchUp, and TouchDown, to name a few.
Developers have direct access to the mouse and other input devices if needed. The InputManager object has a property named PrimaryMouseDevice, which returns a MouseDevice object. Using the MouseDevice object, you can get the position of the mouse at any time through a method named GetScreenPosition. Additionally, the MouseDevice has a method named GetPosition, which takes in a user interface element and returns the mouse position within the coordinate space of that element. This information is crucial when determining mouse interactions such as the mouse hover event. With each new SkeletonFrame generated by the Kinect SDK, we are given the position of each skeleton joint in relation to skeleton space; we then have to perform coordinate space transforms to translate the joint positions to be usable with visual elements. The GetScreenPosition and GetPosition methods on the MouseDevice object do this work for the developer for mouse input.
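At its simplest, translating a screen-space point into an element's local space is a subtraction of the element's on-screen origin. A minimal Python sketch of the idea behind an element-relative position lookup (a hypothetical helper; the real WPF method also accounts for render and layout transforms):

```python
def get_position(screen_point, element_origin):
    """Translate a screen-space point into an element's local coordinate
    space by subtracting the element's top-left screen position."""
    return (screen_point[0] - element_origin[0],
            screen_point[1] - element_origin[1])
```

A cursor at screen point (500, 400) over an element whose top-left corner sits at (100, 150) is at local point (400, 250) within that element.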
In some ways, Kinect is comparable with the mouse, but the comparisons abruptly break down. Skeleton joints enter and leave visual elements similar to a mouse. In other words, joints hover like a mouse cursor. However, the click and mouse button up and down interactions do not exist. As we will see in the next chapter, there are gestures that simulate a click through a push gesture. The button push metaphor is weak when applied to Kinect and so the comparison with the mouse ends with the hover.
Kinect does not have much in common with touch input either. Touch input is available from the Touch and TouchDevice classes. Single touch input is similar to mouse input, whereas multiple touch input is akin to Kinect. The mouse has only a single interaction point (the point of the mouse cursor), but touch input can have multiple input points, just as Kinect can have multiple skeletons, and each skeleton has twenty input points. Kinect is more informative, because we know which input points belong to which user. With touch input, the application has no way of knowing how many users are actually touching the screen. If the application receives ten touch inputs, is it one person pressing all ten fingers, or is it ten people pressing one finger each? While touch input has multiple input points, it is still a two-dimensional input like the mouse or stylus. To be fair, touch input does have breadth, meaning it includes a location (X, Y) of the point and the bounding area of the contact point. After all, a user pressing a finger on a touch screen is never as precise as a mouse pointer or stylus; it always covers more than one pixel.
While there are similarities, Kinect input clearly does not neatly conform to the form of any input device supported by WPF. It has a unique set of interactions and user interface metaphors. It has yet to be determined if Kinect should function in the same way as other input devices. At the core, the mouse, touch, or stylus report a single pixel point location. The input system then determines the location of the pixel point in the context of a visual element, and that visual element reacts accordingly. Current Kinect user interfaces attempt to use the hand joints as alternatives to mouse or touch input, but it is not clear yet if this is how Kinect should be used or if the designer and developer community is simply trying to make Kinect conform to known forms of user input.
The expectation is that at some point Kinect will be fully integrated into WPF. Until WPF 4.0, touch input was a separate component. Touch was first introduced with Microsoft's Surface. The Surface SDK included a special set of WPF controls like SurfaceButton, SurfaceCheckBox, and SurfaceListBox. If you wanted a button that responded to touch events, you had to use the SurfaceButton control.
One can speculate that if Kinect input were to be assimilated into WPF, there might be a class named SkeletonDevice, which would look similar to the SkeletonFrame object of the Kinect SDK. Each Skeleton object would have a method named GetJointPoint, which would function like the GetPosition method on MouseDevice or the GetTouchPoint on TouchDevice. Additionally, the core visual elements (UIElement, ContentElement, FrameworkElement, and FrameworkContentElement) would have events and methods to notify and handle skeleton joint interactions. For example, there might be JointEnter, JointLeave, and JointHover events. Further, just as touch input has the ManipulationStarted and ManipulationEnded events, there might be GestureStarted and GestureEnded events associated with Kinect input.
For now, the Kinect SDK is a separate entity from WPF, and as such, it does not natively integrate with the input system. It is the responsibility of the developer to track skeleton joint positions and determine when joint positions intersect with user interface elements. When a skeleton joint is within the coordinate space of a visual element, we must then manually alter the appearance of the element to react to the interaction. Woe is the life of a developer when working with a new technology.
Detecting User Interaction
Before we can determine if a user has interacted with visual elements on the screen, we must define what it means for the user to interact with a visual element. Looking at a mouse- or cursor-driven application, there are two well-known interactions. A mouse hovers over a visual element and clicks. These interactions break down even further into other more granular interactions. For a cursor to hover, it must enter the coordinate space of the visual element. The hover interaction ends when the cursor leaves the coordinate space of the visual element. In WPF, the MouseEnter and MouseLeave events fire when the user performs these interactions. A click is the act of the mouse button being pressed down (MouseDown) and released (MouseUp).

There is another common mouse interaction beyond a click and hover. If a user hovers over a visual element, presses down the left mouse button, and then moves the cursor around the screen, we call this a drag. The drop interaction happens when the user releases the mouse button. Drag and drop is a complex interaction, much like a gesture.
For the purpose of this chapter, we focus on the first set of simple interactions where the cursor hovers, enters, and leaves the space of the visual element. In the Kinect the Dots project from the previous chapter, we had to determine when the user's hand was in the vicinity of a dot before drawing a connecting line. In that project, the application did not interact with the user interface as much as the user interface reacted to the user. This distinction is important. The application generated the locations of the dots within a coordinate space that was the same as the screen size, but these points were not derived from the screen space. They were just data stored in variables. We fixed the screen size to make it easy. Upon receipt of each new skeleton frame, the position of the skeleton hand was translated into the coordinate space of the dots, after which we determined if the position of the hand was the same as the current dot in the sequence. Technically, this application could function without a user interface. The user interface was created dynamically from data. In that application, the user is interacting with the data and not the user interface.
Hit Testing
Determining when a user's hand is near a dot is not as simple as checking if the coordinates of the hand match exactly the position of the dot. Each dot is just a single pixel, and it would be impossible for a user to place their hand easily and routinely in the same pixel position. To make the application usable, we do not require the position of the hand to be the same as the dot, but rather within a certain range. We created a circle with a set radius around the dot, with the dot being the center of the circle. The user just has to break the plane of the proximity circle for the hand to be considered hovering over the dot. Figure 5-1 illustrates this. The white dot within the visual element circle is the actual dot point and the dotted circle is the proximity circle. The hand image is centered to the hand point (white dot within the hand icon). It is therefore possible for the hand image to cross the proximity circle, but the hand point to be outside the dot. The process of checking to see if the hand point breaks the plane of the dot is called hit testing.
Figure 5-1. Dot proximity testing
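The proximity-circle test reduces to comparing the squared distance between the hand point and the dot center against the squared radius. A minimal sketch of the test (Python rather than the project's C#):

```python
def hit_test_dot(hand_x, hand_y, dot_x, dot_y, radius):
    """True when the hand point lies inside the proximity circle centered
    on the dot. Comparing squared values avoids a square root."""
    dx = hand_x - dot_x
    dy = hand_y - dot_y
    return dx * dx + dy * dy <= radius * radius
```

Note that the hand point, not the hand image, decides the hit, which is exactly the distinction Figure 5-1 illustrates.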
Again, in the Kinect the Dots project, the user interface reacts to the data. The dots are drawn on the screen according to the generated coordinates. The application performs hit testing using the dot data and not the size and layout of the visual element. Most applications and games do not function this way. The user interfaces are more complex and often dynamic. Take, for example, the ShapeGame application (Figure 5-2) that comes with the Kinect for Windows SDK. It generates shapes that drop from the sky. The shapes pop and disappear when the user "touches" them.
Figure 5-2. Microsoft SDK sample ShapeGame
An application like ShapeGame requires a more complex hit testing algorithm than that of Kinect the Dots. WPF provides some tools to help hit test visual objects. The VisualTreeHelper class (System.Windows.Media namespace) has a method named HitTest. There are multiple overloads for this method, but the primary method signature takes in a Visual object and a point. It returns the top-most visual object within the specified visual object's visual tree at that point. If that seems complicated and it is not inherently obvious what this means, do not worry. A simple explanation is that WPF has a layered visual output. More than one visual element can occupy the same relative space. If more than one visual element is at the specified point, the HitTest method returns the element at the top layer. Due to WPF's styling and templating system, which allows controls to be composites of one or more visual elements and other controls, more often than not there are multiple visual elements at any given coordinate point.

Figure 5-3 helps to illustrate the layering of visual elements. There are three elements: a Rectangle, a Button, and an Ellipse. All three are in a Canvas panel. The ellipse and the button sit on top of the rectangle. In the first frame, the mouse is over the ellipse and a hit test at this point returns the ellipse. A hit test in the second frame returns the rectangle even though it is the bottom layer. While the rectangle is at the bottom, it is the only visual element occupying the pixel at the mouse's cursor position. In the third frame, the cursor is over the button. Hit testing at this point returns a TextBlock element. If the cursor were not on the text in the button, a hit test would return a ButtonChrome element. The button's visual representation is composed of one or more visual controls, and is customizable. In fact, the button has no inherent visual style. A Button is a visual element that inherently has no visual representation. The button shown in Figure 5-3 uses the default style, which is in part made up of a TextBlock and a ButtonChrome. It is important to understand that hit testing on a control does not necessarily mean the hit test returns the desired or expected visual element or control, as is the case with the Button. In this example, we always get one of the elements that compose the button visual, but never the actual button control.
Figure 5-3. Layered UI elements
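The top-layer rule can be modeled as a search from the top of the z-order downward: the first element containing the point wins. In this Python sketch, elements are reduced to axis-aligned bounding boxes for brevity; real WPF hit testing respects each element's actual geometry and visual tree:

```python
def hit_test(elements, x, y):
    """Return the name of the top-most element whose bounding box contains
    (x, y). 'elements' is ordered bottom to top, so iterate in reverse."""
    for name, (left, top, width, height) in reversed(elements):
        if left <= x <= left + width and top <= y <= top + height:
            return name
    return None
```

With a rectangle underneath an ellipse, a point inside both reports the ellipse, while a point covered only by the rectangle reports the rectangle, mirroring the first two frames of Figure 5-3.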
To make hit testing more convenient, WPF provides other methods to assist with hit testing. The UIElement class defines an InputHitTest method, which takes in a Point and returns an IInputElement that is at the specified point. The UIElement and ContentElement classes both implement the IInputElement interface. This means that virtually all user interface elements within WPF are covered. The VisualTreeHelper class also has a set of HitTest methods, which can be used more generically.
Note: The MSDN documentation for the UIElement.InputHitTest method states, "This method typically is not called from your application code. Calling this method is only appropriate if you intend to re-implement a substantial amount of the low level input features that are already present, such as recreating mouse device logic." Kinect is not integrated into WPF's "low-level input features"; therefore, it is necessary to recreate mouse device logic.
In WPF, hit testing depends on two variables, a visual element and a point. The test determines if the specified point lies within the coordinate space of the visual element. Let's use Figure 5-4 to better understand the coordinate spaces of visual elements. Each visual element in WPF, regardless of shape and size, has what is called a bounding box: a rectangular shape around the visual element that defines the width and height of the visual element. This bounding box is used by the layout system to determine the overall dimensions of the visual element and how to arrange it on the screen. While the Canvas arranges its children based on values specified by the developer, an element's bounding box is fundamental to the layout algorithm of other panels such as the Grid and StackPanel. The bounding box is not visually shown to the user, but is represented in Figure 5-4 by the dotted box surrounding each visual element. Additionally, each element has an X and Y position that defines the element's location within its parent container. To obtain the bounding box and position of an element, call the GetLayoutSlot method of the LayoutInformation (static) class (System.Windows.Controls.Primitives).

Take, for example, the triangle. The top-left corner of the bounding box is point (0, 0) of the visual element. The width and height of the triangle are each 200 pixels. The three points of the triangle within the bounding box are at (100, 0), (200, 200), and (0, 200). A hit test is only successful for points within the triangle and not for all points within the bounding box. A hit test for point (0, 0) is unsuccessful, whereas a test at the center of the triangle, point (100, 100), is successful.
Figure 5-4. Layout space and bounding boxes
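The triangle example shows why a bounding-box check alone is not enough. One standard point-in-triangle test checks which side of each edge the point falls on; the sketch below is a generic Python illustration of that technique, not WPF's internal algorithm:

```python
def _edge(px, py, ax, ay, bx, by):
    # Cross product sign: which side of the edge a->b the point (px, py) is on.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def in_triangle(px, py, v1, v2, v3):
    """True when (px, py) lies inside the triangle, not merely inside its
    rectangular bounding box."""
    d1 = _edge(px, py, *v1, *v2)
    d2 = _edge(px, py, *v2, *v3)
    d3 = _edge(px, py, *v3, *v1)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # Inside when the point is on the same side of all three edges.
    return not (has_neg and has_pos)
```

For the triangle with vertices (100, 0), (200, 200), and (0, 200), the bounding-box corner (0, 0) misses while the center (100, 100) hits, matching the text above.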
Hit testing results depend on the layout of the visual elements. In all of our projects, we used the Canvas panel to hold our visual elements. The Canvas panel is the one visual element container that gives the developer complete control over the placement of the visual elements, which can be especially useful when working with Kinect. Basic functions like hand tracking are possible with other WPF panels, but require more work and do not perform as well as the Canvas panel. With the Canvas panel, the developer explicitly sets the X and Y position (Canvas.Left and Canvas.Top, respectively) of the child visual element. Coordinate space translation, as we have seen, is straightforward with the Canvas panel, which means less code to write and better performance because there is less processing needed.
The disadvantage to using a Canvas is the same reason for using the Canvas panel. The developer has complete control over the placement of visual elements and therefore is also responsible for things like updating element positions when the window resizes or arranging complex layouts. Panels such as the Grid and StackPanel make UI layout updates and resizing painless to the developer. However, these panels increase the complexity of hit testing by increasing the size of the visual tree and by adding additional coordinate spaces. The more coordinate spaces, the more point translations needed. These panels also honor the alignment (horizontal and vertical) and margin properties of the FrameworkElement, which further complicates the calculations necessary for hit testing. If there is any possibility that a visual element will have RenderTransforms, you will be smart to use the WPF hit testing and not attempt to do this testing yourself.
A hybrid approach is to place visual elements that change frequently based on skeleton joint positions, such as hand cursors, in a Canvas, and place all other UI elements in other panels. Such a layout scheme requires more coordinate space transforms, which can affect performance and possibly introduce bugs related to improper transform calculations. The hybrid method is at times the more appropriate choice because it takes full advantage of the WPF layout system. Refer to the MSDN documentation on WPF's layout system, panels, and hit testing for a thorough understanding of these concepts.
Responding to Input
Hit testing only tells us that the user input point is within the coordinate space of a visual element. One of the important functions of a user interface is to give users feedback on their actions. When we move our mouse over a button, we expect the button to change visually in some way (change color, grow in size, animate, reveal a background glow), telling the user the button is clickable. Without this feedback, the user experience is not only flat and uninteresting, but also possibly confusing and frustrating. A failed application experience means the application as a whole is a failure, even if it technically functions flawlessly.

WPF has a fantastic system for notifying and responding to user input. The styling and template system makes user interfaces that properly respond to user input easy to build and highly customizable, but only if your user input comes from a mouse, stylus, or touch device. Kinect developers have two options: do not use WPF's system and do everything manually, or create special controls that respond to Kinect input. The latter, while not overly difficult, is not a beginner's task.

With this in mind, we move to the next section, where we build a game that applies hit testing and manually responds to user input. Before moving on, consider a question, which we have purposefully not addressed until now. What does it mean for a Kinect skeleton to interact with the user interface? The core mouse interactions are: enter, leave, and click. Touch input has enter, leave, down, and up interactions. A mouse has a single position point. Touch can have multiple position points, but there is always a primary point. A Kinect skeleton has twenty possible position points. Which of these is the primary point? Should there be a primary point? Should a visual element, such as a button, react when any one skeleton point enters the element's coordinate space, or should it react to only certain joint points, for instance the hands?

There is no one answer to all of these questions. It largely depends on the function and design of your user interface. These types of questions are part of a broader subject called Natural User Interface design, which is a significant topic in the next chapter. For most Kinect applications, including the projects in this chapter, the only joints that interact with the user interface are the hands. The starting interactions are enter and leave. Interactions beyond these become complicated quickly. We cover more complicated interactions later in the chapter and throughout the next chapter, but for now the focus is on the basics.
Simon Says
To demonstrate working with Kinect as an input device, we start our next project, which uses the hand joints as if they were a cross between a mouse and touch input. The project's goal is to give a practical, but introductory, example of how to perform hit testing and create user interactions with WPF visual elements. The project is a game named Simon Says.

Growing up during my early grade school years, we played a game named Simon Says. In this game, one person plays the role of Simon and gives instructions to the other players. A typical instruction is, “Put your left hand on top of your head.” Players perform the instruction only if it is preceded by the words “Simon says.” For example, “Simon says, ‘stomp your feet’” in contrast to just “stomp your feet.” Any
player caught following an instruction not preceded by “Simon says” is out of the game. These are the game's rules. Did you play Simon Says as a child? Do kids still play this game? Look it up if you do not know the game.
Tip  The traditional version of Simon Says makes a fun drinking game—but only if you are old enough to drink. Please drink responsibly.
In the late ’70s and early ’80s, the game company Milton Bradley created a hand-held electronic version of Simon Says named Simon. This game consisted of four colored (red, blue, green, and yellow) buttons. In the electronic version of the game, the computer gives the player a sequence of buttons to press. When giving the instructions, the computer lights each button in the correct sequence. The player must then repeat the button sequence. After the player successfully repeats the button sequence, the computer presents another. The sequences become progressively more challenging. The game ends when the player cannot repeat the sequence.

We attempt to recreate the electronic version of Simon Says using Kinect. It is a perfect introductory example of using skeleton tracking to interact with user interface elements. The game also has a simple set of rules, which we can quickly implement. Figure 5-5 illustrates our desired user interface. It consists of four rectangles, which serve as game buttons or targets. We have a game title at the top of the screen, and an area in the middle of the screen for game instructions.
Figure 5-5. Simon Says user interface
Our version of Simon Says works by tracking the player's hands; when a hand makes contact with one of the colored squares, we consider this a button press. It is common in Kinect applications to use hover or press gestures to interact with buttons. For now, our approach to player interactions remains simple. The game starts with the player placing her hands over the hand markers in the red boxes. Immediately after both hands are on the markers, the game begins issuing instructions. The game is over and returns to this state when the player fails to repeat the sequence. At this point, we have a basic understanding of the game's concept, rules, and look. Now we write code.
Simon Says, “Design a User Interface”
Start by building the user interface. Listing 5-1 shows the XAML for the MainWindow. As with our previous examples, we wrap our main UI elements in a Viewbox control to handle scaling to different monitor resolutions. Our UI dimensions are set to 1920x1080. There are four sections to our UI: title and instructions, game interface, game start interface, and cursors for hand tracking. The first TextBlock holds the title, and the instruction UI elements are in the StackPanel that follows. These UI components serve only to help the player know the current state of the game. They have no other function and are not related to Kinect or skeleton tracking. However, the other UI elements are.

The GameCanvas, ControlCanvas, and HandCanvas all hold UI elements with which the application interacts, based on the position of the player's hands. The hand positions obviously come from skeleton tracking. Taking these items in reverse order, the HandCanvas should be familiar. The application has two cursors that follow the movements of the player's hands, as we saw in the projects from the previous chapter. The ControlCanvas holds the UI elements that trigger the start of the game, and the GameCanvas holds the blocks, which the player presses during the game. The different interactive components are broken into multiple containers, making the user interface easier to manipulate in code. For example, when the user starts the game, we want to hide the ControlCanvas. It is much easier to hide one container than to write code to show and hide all of the children individually.

After updating the MainWindow.xaml file with the code in Listing 5-1, run the application. The screen should look like Figure 5-5.
Listing 5-1. Simon Says User Interface
<Window x:Class="SimonSays.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:c="clr-namespace:SimonSays"
Title="Simon Says" WindowState="Maximized">
<Viewbox>
<Grid x:Name="LayoutRoot" Height="1080" Width="1920" Background="White"
TextElement.Foreground="Black">
<TextBlock Text="Simon Says" FontSize="72" Margin="0,25,0,0"
HorizontalAlignment="Center" VerticalAlignment="Top"/>
<StackPanel HorizontalAlignment="Center" VerticalAlignment="Center" Width="600">
<TextBlock x:Name="GameStateElement" FontSize="55" Text="GAME OVER!"
HorizontalAlignment="Center"/>
<TextBlock x:Name="GameInstructionsElement"
Text="Place hands over the targets to start a new game."
FontSize="45" HorizontalAlignment="Center"
TextAlignment="Center" TextWrapping="Wrap" Margin="0,20,0,0"/>
</StackPanel>
<Canvas x:Name="GameCanvas">
<Rectangle x:Name="RedBlock" Height="400" Width="400" Fill="Red"
Canvas.Left="170" Canvas.Top="90" Opacity="0.2"/>
<Rectangle x:Name="BlueBlock" Height="400" Width="400" Fill="Blue"
Canvas.Left="170" Canvas.Top="550" Opacity="0.2"/>
<Rectangle x:Name="GreenBlock" Height="400" Width="400" Fill="Green"
Canvas.Left="1350" Canvas.Top="550" Opacity="0.2"/>
<Rectangle x:Name="YellowBlock" Height="400" Width="400" Fill="Yellow"
Canvas.Left="1350" Canvas.Top="90" Opacity="0.2"/>
</Canvas>
<Canvas x:Name="ControlCanvas">
<Border x:Name="RightHandStartElement" Background="Red" Height="200"
Padding="20" Canvas.Left="1420" Canvas.Top="440">
<Image Source="Images/hand.png"/>
</Border>
<Border x:Name="LeftHandStartElement" Background="Red" Height="200"
Padding="20" Canvas.Left="300" Canvas.Top="440">
<Image Source="Images/hand.png">
<Image.RenderTransform>
<TransformGroup>
<TranslateTransform X="-130"/>
<ScaleTransform ScaleX="-1"/>
</TransformGroup>
</Image.RenderTransform>
</Image>
</Border>
</Canvas>
<Canvas x:Name="HandCanvas">
<Image x:Name="RightHandElement" Source="Images/hand.png"
Visibility="Collapsed" Height="100" Width="100"/>
<Image x:Name="LeftHandElement" Source="Images/hand.png"
Visibility="Collapsed" Height="100" Width="100">
<Image.RenderTransform>
<TransformGroup>
<ScaleTransform ScaleX="-1"/>
<TranslateTransform X="90"/>
</TransformGroup>
</Image.RenderTransform>
</Image>
</Canvas>
</Grid>
</Viewbox>
</Window>
Simon Says, “Build the Infrastructure”
With the UI in place, we turn our focus to the game's infrastructure. Update the MainWindow.xaml.cs file to include the necessary code to receive SkeletonFrameReady events. In the SkeletonFrameReady event handler, add the code to track player hand movements. The base of this code is in Listing 5-2. TrackHand is a refactored version of Listing 4-7, where the method takes in the UI element for the cursor and the parent element that defines the layout space.
Listing 5-2. Initial SkeletonFrameReady Event Handler
private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            frame.CopySkeletonDataTo(this._FrameSkeletons);
            Skeleton skeleton = GetPrimarySkeleton(this._FrameSkeletons);

            if(skeleton == null)
            {
                LeftHandElement.Visibility = Visibility.Collapsed;
                RightHandElement.Visibility = Visibility.Collapsed;
            }
            else
            {
                TrackHand(skeleton.Joints[JointType.HandLeft], LeftHandElement, LayoutRoot);
                TrackHand(skeleton.Joints[JointType.HandRight], RightHandElement, LayoutRoot);
            }
        }
    }
}

private static Skeleton GetPrimarySkeleton(Skeleton[] skeletons)
{
    Skeleton skeleton = null;

    if(skeletons != null)
    {
        //Find the closest skeleton
        for(int i = 0; i < skeletons.Length; i++)
        {
            if(skeletons[i].TrackingState == SkeletonTrackingState.Tracked)
            {
                if(skeleton == null)
                {
                    skeleton = skeletons[i];
                }
                else
                {
                    if(skeleton.Position.Z > skeletons[i].Position.Z)
                    {
                        skeleton = skeletons[i];
                    }
                }
            }
        }
    }
    return skeleton;
}
For most games, a polling architecture is the better and more common approach. Normally, a game has what is called a gaming loop, which manually gets the next skeleton frame from the skeleton stream. However, this project uses the event model to reduce the code base and complexity. For the purposes of this book, less code means the material is easier to present to you, the reader, and easier to understand without getting bogged down in the complexities of gaming loops and possibly threading. The event system also provides us a cheap gaming loop, which, again, means we have to write less code. However, be careful when using the event system in place of a true gaming loop. Besides performance concerns, events are often not reliable enough to operate as a true gaming loop, which may result in your application being buggy or not performing as expected.
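For contrast, here is a minimal sketch of what a polling-style gaming loop looks like. The structure (poll, update, render, repeat) is generic, the function names are our own, and nothing here is SDK-specific; in the Kinect SDK the polling call would come from the skeleton stream rather than an event handler. Python is used here purely for brevity:

```python
def run_game_loop(poll_frame, update, render, max_ticks):
    """A minimal polling-style gaming loop: each tick polls for the next
    input frame, applies it to the game state, and renders. Returns the
    number of frames actually processed."""
    frames_processed = 0
    for _ in range(max_ticks):
        frame = poll_frame()      # poll: may return None if no frame is ready
        if frame is not None:
            update(frame)         # update game state from the new input
            frames_processed += 1
        render()                  # draw the current state every tick
    return frames_processed

# Drive the loop with a fake frame source: two frames, one empty poll.
frames = iter([{"hand": (1, 2)}, None, {"hand": (3, 4)}])
state = []
processed = run_game_loop(lambda: next(frames, None), state.append, lambda: None, 3)
print(processed)  # 2
```

The point of the sketch is that the loop, not an event, decides when input is read, which is why a true gaming loop gives more predictable timing than the event model used in this chapter.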
Simon Says, “Add Game Play Infrastructure”
The game Simon Says breaks down into three phases. The initial phase, which we will call GameOver, means no game is actively being played. This is the default state of the game. It is also the state to which the game reverts when Kinect stops detecting players. The game loops from Simon giving instructions to the player repeating or performing the instructions. This continues until the player cannot correctly perform the instructions. The application defines an enumeration to describe the game phases and a member variable to track the game state. Additionally, we need a member variable to track the current round or level of the game. The value of the level-tracking variable increments each time the player successfully repeats Simon's instructions. Listing 5-3 details the game phase enumeration and member variables. The member variables are initialized in the class constructor.
Listing 5-3. Game Play Infrastructure
public enum GamePhase
{
    GameOver            = 0,
    SimonInstructing    = 1,
    PlayerPerforming    = 2
}
public partial class MainWindow : Window
{
    #region Member Variables
    private KinectSensor _KinectDevice;
    private Skeleton[] _FrameSkeletons;
    private GamePhase _CurrentPhase;
    private int _CurrentLevel;
    #endregion Member Variables

    #region Constructor
    public MainWindow()
    {
        InitializeComponent();

        //Any other constructor code such as sensor initialization goes here.

        this._CurrentPhase = GamePhase.GameOver;
        this._CurrentLevel = 0;
    }
    #endregion Constructor

    #region Methods
    //Code from Listing 5-2 and any additional supporting methods
    #endregion Methods
}
We now revisit the SkeletonFrameReady event handler, which needs to determine what action to take based on the state of the application. The code in Listing 5-4 details the code changes. Update the SkeletonFrameReady event handler with this code and stub out the ChangePhase, ProcessGameOver, and ProcessPlayerPerforming methods. We cover the functional code of these methods later. The first method takes only a GamePhase enumeration value, while the latter two have a single parameter of Skeleton type.

When the application cannot find a primary skeleton, the game ends and enters the game-over phase. This happens when the user leaves the view area of Kinect. When Simon is giving instructions to the user, the game hides the hand cursors; otherwise, it updates the position of the hand cursors. When the game is in either of the other two phases, the game calls special processing methods based on the particular game phase.
Listing 5-4. SkeletonFrameReady Event Handler
private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            frame.CopySkeletonDataTo(this._FrameSkeletons);
            Skeleton skeleton = GetPrimarySkeleton(this._FrameSkeletons);

            if(skeleton == null)
            {
                ChangePhase(GamePhase.GameOver);
            }
            else
            {
                if(this._CurrentPhase == GamePhase.SimonInstructing)
                {
                    LeftHandElement.Visibility = Visibility.Collapsed;
                    RightHandElement.Visibility = Visibility.Collapsed;
                }
                else
                {
                    TrackHand(skeleton.Joints[JointType.HandLeft],
                              LeftHandElement, LayoutRoot);
                    TrackHand(skeleton.Joints[JointType.HandRight],
                              RightHandElement, LayoutRoot);

                    switch(this._CurrentPhase)
                    {
                        case GamePhase.GameOver:
                            ProcessGameOver(skeleton);
                            break;

                        case GamePhase.PlayerPerforming:
                            ProcessPlayerPerforming(skeleton);
                            break;
                    }
                }
            }
        }
    }
}
Starting a New Game
The application has a single function when in the GameOver phase: detect when the user wants to play the game. The game starts when the player places her hands in the respective hand markers. The left hand needs to be within the space of the LeftHandStartElement, and the right hand needs to be within the space of the RightHandStartElement. For this project, we use WPF's built-in hit testing functionality. Our UI is small and simple. The number of UI elements available for processing in an InputHitTest method call is extremely limited; therefore, there are no performance concerns. Listing 5-5 contains the code for the ProcessGameOver method and the GetHitTarget helper method. The GetHitTarget method is used in other places in the application.
Listing 5-5. Detecting When the User Is Ready to Start the Game
private void ProcessGameOver(Skeleton skeleton)
{
    //Determine if the user triggers the start of a new game
    if(GetHitTarget(skeleton.Joints[JointType.HandLeft], LeftHandStartElement) != null &&
       GetHitTarget(skeleton.Joints[JointType.HandRight], RightHandStartElement) != null)
    {
        ChangePhase(GamePhase.SimonInstructing);
    }
}

private IInputElement GetHitTarget(Joint joint, UIElement target)
{
    Point targetPoint = GetJointPoint(this.KinectDevice, joint,
                                      LayoutRoot.RenderSize, new Point());
    targetPoint = LayoutRoot.TranslatePoint(targetPoint, target);

    return target.InputHitTest(targetPoint);
}
The logic of the ProcessGameOver method is simple and straightforward: if each of the player's hands is in the space of its respective target, change the state of the game. The GetHitTarget method is responsible for testing if the joint is in the target space. It takes in the source joint and the desired target, and returns the specific IInputElement occupying the coordinate point of the joint. While the method only has three lines of code, it is important to understand the logic behind the code.

Our hit testing algorithm consists of three basic steps. The first step gets the coordinates of the joint within the coordinate space of the LayoutRoot. The GetJointPoint method does this for us. This is the same method from the previous chapter; copy the code from Listing 4-3 and paste it into this project. Next, the joint point in the LayoutRoot coordinate space is translated to the coordinate space of the target using the TranslatePoint method. This method is defined in the UIElement class, of which Grid (LayoutRoot) is a descendent. Finally, with the point translated into the coordinate space of the target, we call the InputHitTest method, also defined in the UIElement class. If the point is within the coordinate space of the target, the InputHitTest method returns the exact UI element in the target's visual tree. Any non-null value means the hit test was successful.
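Stripped of the WPF types, the three steps amount to a point translation followed by a bounds check. A Python sketch of the idea (the helper names are our own, and it assumes axis-aligned elements with no RenderTransforms, like our Canvas layout):

```python
def translate_point(point, source_origin, target_origin):
    """Translate a point from one panel's coordinate space to another's,
    given each panel's origin expressed in a shared (root) space.
    A stand-in for WPF's UIElement.TranslatePoint."""
    shared_x = source_origin[0] + point[0]
    shared_y = source_origin[1] + point[1]
    return (shared_x - target_origin[0], shared_y - target_origin[1])

def hit_test(point, size):
    """Is the point (in the target's own space, origin at (0, 0))
    within a target of the given (width, height)?
    A stand-in for InputHitTest's success/failure result."""
    return 0 <= point[0] < size[0] and 0 <= point[1] < size[1]

# A joint at (180, 95) in the root space, tested against a 400x400
# block whose top-left corner sits at (170, 90) in the root space
# (the RedBlock position from Listing 5-1).
local = translate_point((180, 95), (0, 0), (170, 90))
print(local, hit_test(local, (400, 400)))  # (10, 5) True
```

Every extra panel between the root and the target adds another origin to account for, which is exactly why deeper visual trees make hit testing more expensive.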
It is important to note that the simplicity of this logic only works due to the simplicity of our UI layout. Our application consumes the entire screen and is not meant to be resizable. Having a static and fixed UI size dramatically reduces the number of calculations. Additionally, by using Canvas elements to contain all interactive UI elements, we effectively have a single coordinate space. By using other panel types to contain the interactive UI elements, or by using automated layout features such as the HorizontalAlignment, VerticalAlignment, or Margin properties, you potentially increase the complexity of the hit testing logic. In short, the more complicated the UI, the more complicated the hit testing logic, which also adds more performance concerns.
Changing Game State
Compile and run the application. If all goes well, your application should look like Figure 5-6. The application should track the player's hand movements and change the game phase from GameOver to SimonInstructing when the player moves his hands into the start position. The next task is to implement the ChangePhase method, as shown in Listing 5-6. This code is not related to Kinect. In fact, we could just as easily have implemented this same game using touch or mouse input, and this code would still be required.

Figure 5-6. Starting a new game of Simon Says

The function of ChangePhase is to manipulate the UI to denote a change in the game's state and maintain any data necessary to track the progress of the game. Specifically, the GameOver phase fades out the blocks, changes the game instructions, and presents the buttons to start a new game. The code for the SimonInstructing phase goes beyond updating the UI. It calls two methods, which generate the instruction sequence (GenerateInstructions) and display these instructions to the player (DisplayInstructions). Following Listing 5-6 is the source code and further explanation for these methods, as well as the definition of the _InstructionPosition member variable.
Listing 5-6. Controlling the Game State
private void ChangePhase(GamePhase newPhase)
{
    if(newPhase != this._CurrentPhase)
    {
        this._CurrentPhase = newPhase;

        switch(this._CurrentPhase)
        {
            case GamePhase.GameOver:
                this._CurrentLevel           = 0;
                RedBlock.Opacity             = 0.2;
                BlueBlock.Opacity            = 0.2;
                GreenBlock.Opacity           = 0.2;
                YellowBlock.Opacity          = 0.2;
                GameStateElement.Text        = "GAME OVER!";
                ControlCanvas.Visibility     = System.Windows.Visibility.Visible;
                GameInstructionsElement.Text = "Place hands over the targets to start a new game.";
                break;

            case GamePhase.SimonInstructing:
                this._CurrentLevel++;
                GameStateElement.Text        = string.Format("Level {0}", this._CurrentLevel);
                ControlCanvas.Visibility     = System.Windows.Visibility.Collapsed;
                GameInstructionsElement.Text = "Watch for Simon's instructions";
                GenerateInstructions();
                DisplayInstructions();
                break;

            case GamePhase.PlayerPerforming:
                this._InstructionPosition    = 0;
                GameInstructionsElement.Text = "Repeat Simon's instructions";
                break;
        }
    }
}
Presenting Simon’s Commands
Listing 5-7 details a new set of member variables and the GenerateInstructions method. The member variable _InstructionSequence holds a set of UIElements, which comprise Simon's instructions. The player must move his hand over each UIElement in the sequence order defined by the array positions. The instruction set is randomly chosen, with the number of instructions based on the current level or round. For example, round five has five instructions. Also included in this code listing is the DisplayInstructions method, which creates and then begins a storyboard animation to change the opacity of each block in the correct sequence.
Listing 5-7. Generating and Displaying Instructions
private int _InstructionPosition;
private UIElement[] _InstructionSequence;
private Random rnd = new Random();
private void GenerateInstructions()
{
    this._InstructionSequence = new UIElement[this._CurrentLevel];

    for(int i = 0; i < this._CurrentLevel; i++)
    {
        switch(rnd.Next(1, 5))   //maxValue is exclusive, so this yields 1-4
        {
            case 1:
                this._InstructionSequence[i] = RedBlock;
                break;

            case 2:
                this._InstructionSequence[i] = BlueBlock;
                break;

            case 3:
                this._InstructionSequence[i] = GreenBlock;
                break;

            case 4:
                this._InstructionSequence[i] = YellowBlock;
                break;
        }
    }
}
private void DisplayInstructions()
{
    Storyboard instructionsSequence = new Storyboard();
    DoubleAnimationUsingKeyFrames animation;

    for(int i = 0; i < this._InstructionSequence.Length; i++)
    {
        animation = new DoubleAnimationUsingKeyFrames();
        animation.FillBehavior = FillBehavior.Stop;
        animation.BeginTime = TimeSpan.FromMilliseconds(i * 1500);
        Storyboard.SetTarget(animation, this._InstructionSequence[i]);
        Storyboard.SetTargetProperty(animation, new PropertyPath("Opacity"));
        instructionsSequence.Children.Add(animation);

        animation.KeyFrames.Add(new EasingDoubleKeyFrame(0.3,
                                KeyTime.FromTimeSpan(TimeSpan.Zero)));
        animation.KeyFrames.Add(new EasingDoubleKeyFrame(1,
                                KeyTime.FromTimeSpan(TimeSpan.FromMilliseconds(500))));
        animation.KeyFrames.Add(new EasingDoubleKeyFrame(1,
                                KeyTime.FromTimeSpan(TimeSpan.FromMilliseconds(1000))));
        animation.KeyFrames.Add(new EasingDoubleKeyFrame(0.3,
                                KeyTime.FromTimeSpan(TimeSpan.FromMilliseconds(1300))));
    }

    instructionsSequence.Completed += (s, e) =>
    {
        ChangePhase(GamePhase.PlayerPerforming);
    };

    instructionsSequence.Begin(LayoutRoot);
}
Running the application now, we can see the game starting to come together. The player can start the game, which then causes Simon to begin issuing instructions.
Doing as Simon Says
The final aspect of the game is to implement the functionality to capture the player acting out the instructions. Notice that when the storyboard completes animating Simon's instructions, the application calls the ChangePhase method to transition the application into the PlayerPerforming phase. Refer back to Listing 5-4, which has the code for the SkeletonFrameReady event handler. When in the PlayerPerforming phase, the application executes the ProcessPlayerPerforming method. On the surface, implementing this method should be easy. The logic is such that a player successfully repeats an instruction when one of his hands enters the space of the target user interface element. Essentially, this is the same hit testing logic we already implemented to trigger the start of the game (Listing 5-5). However, instead of testing against two static UI elements, we test for the next UI element in the instruction array. Add the code in Listing 5-8 to the application, then compile and run it. You will quickly notice that the application works, but is very unfriendly to the user. In fact, the game is unplayable. Our user interface is broken.
Listing 5-8. Processing Player Movements When Repeating Instructions
private void ProcessPlayerPerforming(Skeleton skeleton)
{
    IInputElement leftTarget;
    IInputElement rightTarget;
    FrameworkElement correctTarget;

    correctTarget = this._InstructionSequence[this._InstructionPosition];
    leftTarget    = GetHitTarget(skeleton.Joints[JointType.HandLeft], GameCanvas);
    rightTarget   = GetHitTarget(skeleton.Joints[JointType.HandRight], GameCanvas);

    if(leftTarget != null && rightTarget != null)
    {
        ChangePhase(GamePhase.GameOver);
    }
    else if(leftTarget == null && rightTarget == null)
    {
        //Do nothing - neither hand is over a target
    }
    else if((leftTarget == correctTarget && rightTarget == null) ||
            (rightTarget == correctTarget && leftTarget == null))
    {
        this._InstructionPosition++;

        if(this._InstructionPosition >= this._InstructionSequence.Length)
        {
            ChangePhase(GamePhase.SimonInstructing);
        }
    }
    else
    {
        ChangePhase(GamePhase.GameOver);
    }
}
Before breaking down the flaws in the logic, let's understand essentially what this code attempts to accomplish. The first lines of code get the target element, which is the current instruction in the sequence. Then, through hit testing, it gets the UI elements at the points of the left and right hands. The rest of the code evaluates these three variables. If both hands are over UI elements, then the game is over; our game is simple and only allows a single block at a time. When neither hand is over a UI element, then there is nothing for us to do. If one of the hands matches the expected target, then we increment our instruction position in the sequence. The process continues while there are more instructions, until the player reaches the end of the sequence. When this happens, the game phase changes to SimonInstructing, which moves the player to the next round. For any other condition, the application transitions to the GameOver phase.
This works fine only as long as the user is heroically fast, because the instruction position increments as soon as the user's hand enters the UI element. The user is given no time to clear their hand from the UI element's space before their hand's position is evaluated against the next instruction in the sequence. It is impossible for any player to get past level two. As soon as the player successfully repeats the first instruction of round two, the game abruptly ends. This obviously ruins the fun and challenge of the game.
We solve this problem by waiting to advance to the next instruction in the sequence until after the user's hand has cleared the UI element. This gives the user an opportunity to get her hands into a neutral position before the application begins evaluating the next instruction. We need to track when the user's hand enters and leaves a UI element.
In WPF, each UIElement object has events that fire when a mouse enters and leaves the space of a UI element: MouseEnter and MouseLeave, respectively. Unfortunately, as noted, WPF does not natively support UI interactions with skeleton joints produced by Kinect. This project would be a whole lot easier if each UIElement had events named JointEnter and JointLeave that fired each time a skeleton joint interacted with it. Since we are not afforded this luxury, we have to write the code ourselves. Implementing the same reusable, elegant, low-level tracking of joint movements that exists for the mouse is non-trivial, and impossible to match fully given the accessibility of certain class members. This type of development is also quite beyond the level and scope of this book. Instead, we code specifically for our problem.

The fix for the game play problem is easy to make. We add a couple of new member variables to track the UI element over which each of the player's hands last hovered. When the player's hand enters the space of a UI element, we update the tracking variable. With each new skeleton frame, we check the position of the player's hand; if it is determined to have left the space of the UI element, then we process the UI element. Listing 5-9 shows the updated code for the ProcessPlayerPerforming method. The key changes to the method are in bold.
Listing 5-9. Detecting Users’ Movements During Game Play
private FrameworkElement _LeftHandTarget;
private FrameworkElement _RightHandTarget;

private void ProcessPlayerPerforming(Skeleton skeleton)
{
    UIElement correctTarget   = this._InstructionSequence[this._InstructionPosition];
    IInputElement leftTarget  = GetHitTarget(skeleton.Joints[JointType.HandLeft], GameCanvas);
    IInputElement rightTarget = GetHitTarget(skeleton.Joints[JointType.HandRight], GameCanvas);

    if((leftTarget != this._LeftHandTarget) || (rightTarget != this._RightHandTarget))
    {
        if(leftTarget != null && rightTarget != null)
        {
            ChangePhase(GamePhase.GameOver);
        }
        else if((_LeftHandTarget == correctTarget && _RightHandTarget == null) ||
                (_RightHandTarget == correctTarget && _LeftHandTarget == null))
        {
            this._InstructionPosition++;

            if(this._InstructionPosition >= this._InstructionSequence.Length)
            {
                ChangePhase(GamePhase.SimonInstructing);
            }
        }
        else if(leftTarget != null || rightTarget != null)
        {
            //Do nothing - a hand is over a target, but it has not left yet
        }
        else
        {
            ChangePhase(GamePhase.GameOver);
        }

        if(leftTarget != this._LeftHandTarget)
        {
            AnimateHandLeave(this._LeftHandTarget);
            AnimateHandEnter(leftTarget);
            this._LeftHandTarget = leftTarget as FrameworkElement;
        }

        if(rightTarget != this._RightHandTarget)
        {
            AnimateHandLeave(this._RightHandTarget);
            AnimateHandEnter(rightTarget);
            this._RightHandTarget = rightTarget as FrameworkElement;
        }
    }
}
With these code changes in place, the application is fully functional. There are two new method calls, which execute when updating the tracking variables: AnimateHandLeave and AnimateHandEnter. These methods exist only to initiate some visual effect signaling to the user that she has entered or left a user interface element. These types of visual clues or indicators are important to a successful user experience in your application, and they are yours to implement. Use your creativity to construct any animation you want. For example, you could mimic the behavior of a standard WPF button, or change the size or opacity of the rectangle.
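The enter/leave bookkeeping that Listing 5-9 weaves into ProcessPlayerPerforming can be distilled into a tiny state machine, one per hand. This Python sketch (the names are ours, not from the project) shows the core idea: a transition fires only on the frame where the hit target changes:

```python
class HoverTracker:
    """Track which element a hand cursor last hovered over and report
    enter/leave transitions, mirroring the _LeftHandTarget /
    _RightHandTarget bookkeeping in Listing 5-9."""

    def __init__(self):
        self.current = None   # element the hand was over on the last frame

    def update(self, hit):
        """hit is the element under the hand this frame (or None).
        Returns (left, entered): the element just left and just entered;
        both are None when nothing changed."""
        left = entered = None
        if hit != self.current:
            left, entered = self.current, hit   # edge detected
            self.current = hit
        return left, entered

tracker = HoverTracker()
print(tracker.update("RedBlock"))   # (None, 'RedBlock')  -- hand enters
print(tracker.update("RedBlock"))   # (None, None)        -- still inside
print(tracker.update(None))         # ('RedBlock', None)  -- hand leaves
```

In the project, AnimateHandLeave and AnimateHandEnter would be driven from exactly these two transition slots.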
Enhancing Simon Says
This project is a good first start in building interactive Kinect experiences, but it could use some improvements. There are three areas of improvement: the user experience, the game play, and the presentation. We discuss possible enhancements, but the development is up to you. Grab friends and family and have them play the game. Notice how users move their arms and reach for the game squares. Come up with your own enhancements based on these observations. Make sure to ask them questions, because this feedback is always beneficial to making a better experience.
User Experience
Kinect-based applications and games are extremely new, and until they mature, building good user experiences will consist of many trials and an extreme number of errors. The user interface in this project has much room for improvement. Simon Says users can accidentally interact with a game square, and this is most obvious at the start of the game when the user extends his hands to the game start targets. Once both hands are within the targets, the game begins issuing instructions. If the user does not quickly drop his hands, he could accidentally hit one of the game targets. One change is to give the user time to reset his hands by his side before issuing instructions. Because people naturally drop their hands to their sides, an easy change is simply to delay instruction presentation for a number of seconds. This same delay is necessary in between rounds. A new round of instructions begins immediately after the completion of the previous instruction set. The user should be given time to clear his hands from the game targets.
Game Play
The logic to generate the instruction sequence is simple. The round number determines the number of instructions, and the targets are chosen at random. In the original game, each new round added a new instruction to the instruction set of the previous round. For example, round one might be red. Round two would be red then blue. Round three would add green, so the instruction set would be red, blue, and green. Another change could be to not increase the number of instructions incrementally by one each round. Rounds one through three could have instruction sets with counts that equal the round number, but after that, the instruction count could be twice the round number. A fun aspect of software development is that the application code can be refactored so that we have multiple algorithms for generating instruction sequences. The game could allow the user to pick an instruction generation algorithm; for simplicity, the algorithms could be named easy, medium, and hard. While the instruction sequence gets longer with each round, the instructions display at a constant rate. To increase the difficulty of the game even more, decrease the amount of time each instruction is visible to the user when presenting the instruction set.
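The classic cumulative rule is a one-line change from the chapter's approach. A Python sketch for illustration (the chapter's GenerateInstructions instead regenerates the whole sequence each round):

```python
import random

def generate_round(previous, blocks, rng):
    """Classic Simon rule: keep the previous round's sequence and append
    one newly chosen block, so round n is always a prefix of round n+1."""
    return previous + [rng.choice(blocks)]

blocks = ["Red", "Blue", "Green", "Yellow"]
rng = random.Random(42)   # seeded so runs are repeatable while testing
sequence = []
for round_number in range(1, 4):
    sequence = generate_round(sequence, blocks, rng)
    print(round_number, sequence)
```

A difficulty setting then reduces to swapping in a different generator function (or a different sequence-length rule) while the rest of the game loop stays unchanged.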
Presentation
The presentation of each project in this book is straightforward and easy. Creating visually attractive and amazing applications requires more attention to the presentation than is afforded in these pages. We want to focus more on the mechanics of Kinect development and less on application aesthetics. It is your duty to make gorgeous applications. With a little effort, you can polish the UI of these projects to make them dazzle and engage users. For instance, create nice animated transitions when delivering instructions, and when a user enters and leaves a target area. When users get instruction sets correct, display an animation to reward them. Likewise, have an animation when the game is over. At the very least, create more attractive game targets. Even the simplest games and applications can be engaging to users. An application's allure and charisma come from its presentation and not from the game play.
Reflecting on Simon Says
This project illustrates the basics of user interaction. It tracks the movements of the user's hands on the screen with two cursors, and performs hit tests with each skeleton frame to determine if a hand has entered or left a user interface element. Hit testing is critical to user interaction regardless of the input device. Since Kinect is not integrated into WPF as the mouse, stylus, or touch are, Kinect developers have to do more work to fully implement user interaction in their applications. The Simon Says project serves as an example, demonstrating the concepts necessary to build more robust user interfaces. The demonstration is admittedly shallow; more work is needed to create reusable components.
Depth-Based User Interaction
Our projects working with skeleton data so far (Chapters 4 and 5) utilize only the X and Y values of each skeleton joint. However, this leaves untapped the aspect of Kinect that differentiates it from all other input devices. Each joint comes with a depth value, and every Kinect application should make use of the depth data. Do not forget the Z. The next project explores uses for skeleton data, and examines a basic approach to integrating depth data into a Kinect application.
Without using the 3D capabilities of WPF, there are a few ways to layer visual elements, giving them depth. The layout system ultimately determines the layout order of the visual elements. Using elements of different sizes along with layout system layering gives the illusion of depth. Our new project uses a Canvas and the Canvas.ZIndex property to set the layering of visual elements. It also uses both manual sizing and a ScaleTransform to control dynamic scaling for changes in depth. The user interface of this project consists of a number of circles, each representing a certain depth. The application tracks the user's hands with cursors (hand images), which change in scale depending on the depth of the user. The closer the user is to the screen, the larger the cursors; the farther from Kinect, the smaller the scale.

In Visual Studio, create a new project and add the necessary Kinect code to handle skeleton tracking. Update the XAML in MainWindow.xaml to match that shown in Listing 5-10. Much of the XAML is common to our previous projects, or consists of obvious additions based on the project requirements just described. The main layout panel is the Canvas element. It contains five Ellipses, each with an accompanying TextBlock. The TextBlocks are labels for the circles. Each circle is randomly placed around the screen, but given a specific Canvas.ZIndex value. A detailed explanation behind these values comes later. The Canvas also contains two images that represent the hand cursors. Each defines a ScaleTransform. The image used for the screenshots is that of a right hand. The -1 ScaleX value flips the image to make it look like a left hand.
Listing 5-10. Deep UI Targets XAML
<Window x:Class="DeepUITargets.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:c="clr-namespace:DeepUITargets"
Title="Deep UI Targets"
Height="1080" Width="1920" WindowState="Maximized" Background="White">
<Window.Resources>
<Style x:Key="TargetLabel" TargetType="TextBlock">
<Setter Property="FontSize" Value="40"/>
<Setter Property="Foreground" Value="White"/>
<Setter Property="FontWeight" Value="Bold"/>
<Setter Property="IsHitTestVisible" Value="False"/>
</Style>
</Window.Resources>
<Viewbox>
<Grid x:Name="LayoutRoot" Width="1920" Height="1280">
<StackPanel HorizontalAlignment="Left" VerticalAlignment="Top">
<TextBlock x:Name="DebugLeftHand" Style="{StaticResource TargetLabel}"
Foreground="Black"/>
<TextBlock x:Name="DebugRightHand" Style="{StaticResource TargetLabel}"
Foreground="Black"/>
</StackPanel>
<Canvas>
<Ellipse x:Name="Target3" Fill="Orange" Height="200" Width="200"
Canvas.Left="776" Canvas.Top="162" Canvas.ZIndex="1040"/>
<TextBlock Text="3" Canvas.Left="860" Canvas.Top="206"
Panel.ZIndex="1040" Style="{StaticResource TargetLabel}"/>
<Ellipse x:Name="Target4" Fill="Purple" Height="150" Width="150"
Canvas.Left="732" Canvas.Top="320" Canvas.ZIndex="940"/>
<TextBlock Text="4" Canvas.Left="840" Canvas.Top="372" Panel.ZIndex="940"
Style="{StaticResource TargetLabel}"/>
<Ellipse x:Name="Target5" Fill="Green" Height="120" Width="120"
Canvas.Left="880" Canvas.Top="592" Canvas.ZIndex="840"/>
<TextBlock Text="5" Canvas.Left="908" Canvas.Top="590" Panel.ZIndex="840"
Style="{StaticResource TargetLabel}"/>
<Ellipse x:Name="Target6" Fill="Blue" Height="100" Width="100"
Canvas.Left="352" Canvas.Top="544" Canvas.ZIndex="740"/>
<TextBlock Text="6" Canvas.Left="368" Canvas.Top="582" Panel.ZIndex="740"
Style="{StaticResource TargetLabel}"/>
<Ellipse x:Name="Target7" Fill="Red" Height="85" Width="85" Canvas.Left="378"
Canvas.Top="192" Canvas.ZIndex="640"/>
<TextBlock Text="7" Canvas.Left="422" Canvas.Top="226" Panel.ZIndex="640"
Style="{StaticResource TargetLabel}"/>
<Image x:Name="LeftHandElement" Source="Images/hand.png" Width="80"
Height="80" RenderTransformOrigin="0.5,0.5">
<Image.RenderTransform>
<ScaleTransform x:Name="LeftHandScaleTransform" ScaleY="1"
ScaleX="-1"/>
</Image.RenderTransform>
</Image>
<Image x:Name="RightHandElement" Source="Images/hand.png" Width="80"
Height="80" RenderTransformOrigin="0.5,0.5">
<Image.RenderTransform>
<ScaleTransform x:Name="RightHandScaleTransform" ScaleY="1"
ScaleX="1"/>
</Image.RenderTransform>
</Image>
</Canvas>
</Grid>
</Viewbox>
</Window>
Each circle represents a depth. The element named Target3, for example, corresponds to a depth of three feet. The width and height of Target3 are greater than those of Target7, loosely giving a sense of scale. For our demonstration, hardcoding these values suffices, but real-world applications would dynamically scale based on the specific application conditions. The circles are given unique colors to help further distinguish one from another.
The Canvas element layers the visual elements based on the Canvas.ZIndex values, such that the topmost visual element is the one with the largest Canvas.ZIndex value. If two visual elements have the same Canvas.ZIndex value, the order of definition within the XAML dictates the order of the elements. The Canvas control positions elements such that the larger an element's ZIndex value, the closer to the top the element is layered, and the smaller the number, the farther back it is layered. This means we cannot assign ZIndex values based directly on the distance of the visual element. Instead, inverting the depth values gives the desired effect. The maximum depth value is 13.4 feet. Consequently, our Canvas.ZIndex values range from 0 to 1340, where the depth value is multiplied by 100 for better precision. Therefore, the Canvas.ZIndex value for Target5 at a depth of five feet is 840 (13.4 - 5 = 8.4; 8.4 * 100 = 840).
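The inversion just described can be captured in a one-line helper. Here it is sketched in Python rather than the project's C#, using the 13.4-foot maximum and the multiply-by-100 convention from the text:

```python
MAX_DEPTH_FEET = 13.4  # maximum depth value, in feet, used by this project

def depth_to_zindex(depth_feet):
    """Invert depth so that nearer elements receive larger ZIndex values.

    Multiplying by 100 preserves two decimal places of depth as
    integer precision.
    """
    return round((MAX_DEPTH_FEET - depth_feet) * 100)
```

Target5 at five feet then gets a ZIndex of 840, matching the value in the XAML.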
The final note on the XAML pertains to the two TextBlocks named DebugLeftHand and DebugRightHand. These visual elements are used to display skeleton data, specifically the depth values of the hands. It is quite difficult to debug Kinect applications, especially when you are both the developer and the test user. Temporarily adding elements such as these to an application helps debug the code when traditional debugging techniques fail. Additionally, this information helps to better illustrate the purpose of this project.
The code in Listing 5-11 handles the processing of the skeleton data. The SkeletonFrameReady event handler is no different from previous examples, except for the calls to the TrackHand method, used in previous projects, which is modified to handle the scaling of the cursors. The method converts the X and Y positions from skeleton space to the coordinate space of the container and sets them using the Canvas.SetLeft and Canvas.SetTop methods, respectively. The Canvas.ZIndex is calculated as previously described.

Setting the Canvas.ZIndex is enough to properly layer the visual elements, but it fails to project the sense of perspective needed to produce the illusion of depth. Without this scaling, the application fails to satisfy the user. It fails as a Kinect application, because it does not deliver an experience to the user that they cannot get from any other input device. The scaling calculation used is moderately arbitrary. It is simple enough for this project to demonstrate changes in depth using scale; however, for other applications this approach is too simple.
For the best user experience, the hand cursors should scale to match the relative size of the user's hands. This produces an illusion of the cursor being like a glove on the user's hand. It creates a subtle bond between the application and the user, one that the user will not necessarily be cognizant of, but one that will certainly cause the user to interact more naturally with the application.
Listing 5-11. Hand Tracking with Depth
private void Runtime_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
using(SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
{
if(skeletonFrame != null)
{
skeletonFrame.CopySkeletonDataTo(this._FrameSkeletons);
Skeleton skeleton = GetPrimarySkeleton(this._FrameSkeletons);
if(skeleton != null)
{
TrackHand(skeleton.Joints[JointType.HandLeft], LeftHandElement,
LeftHandScaleTransform, LayoutRoot, true);
TrackHand(skeleton.Joints[JointType.HandRight], RightHandElement,
RightHandScaleTransform, LayoutRoot, false);
}
}
}
}
private void TrackHand(Joint hand, FrameworkElement cursorElement,
ScaleTransform cursorScale, FrameworkElement container, bool isLeft)
{
if(hand.TrackingState != JointTrackingState.NotTracked)
{
double z = hand.Position.Z * FeetPerMeters;
cursorElement.Visibility = System.Windows.Visibility.Visible;
Point cursorCenter = new Point(cursorElement.ActualWidth / 2.0,
cursorElement.ActualHeight / 2.0);
Point jointPoint = GetJointPoint(this.KinectDevice, hand,
container.RenderSize, cursorCenter);
Canvas.SetLeft(cursorElement, jointPoint.X);
Canvas.SetTop(cursorElement, jointPoint.Y);
Canvas.SetZIndex(cursorElement, (int) (1340 - (z * 100)));
cursorScale.ScaleX = 1340 / z * ((isLeft) ? -1 : 1);
cursorScale.ScaleY = 1340 / z;
if(hand.JointType == JointType.HandLeft)
{
DebugLeftHand.Text = string.Format("Left Hand: {0:0.00}", z);
}
else
{
DebugRightHand.Text = string.Format("Right Hand: {0:0.00}", z);
}
}
else
{
//Hide the cursor and clear the debug output when the hand is not tracked
cursorElement.Visibility = System.Windows.Visibility.Collapsed;
DebugLeftHand.Text = string.Empty;
DebugRightHand.Text = string.Empty;
}
}
Make sure to include the GetJointPoint code from previous projects. With that code added, compile and run the project. Move your hands around at multiple depths. The first effect is immediately obvious: the hand cursors scale according to the depth of the user's hand. The second effect, the layering of the visual objects, is easy to see when making broad, dramatic movements back and forth. Watch the hand position values in the debug fields change, and use this information to position your hand either in front of or behind a depth marker. Take the image in Figure 5-7, for example. The right hand is just in front of the four-foot mark, so the cursor is layered between Target3 and Target4, while the left hand is beyond six feet. Figure 5-8 shows the result of both hands at roughly the same depth, between five and six feet, and the cursors display accordingly.

While crude in presentation, this example shows the effects possible when using depth data. When building Kinect applications, developers must think beyond the X and Y planes. Virtually all Kinect applications can incorporate depth using these techniques. All augmented reality applications should employ depth in the experience; otherwise, Kinect is underutilized and the full potential of the experience goes unfulfilled. Don't forget the Z!
Figure 5-7. Hands at different depths
Figure 5-8. Hands at nearly the same depth
Poses
A pose is a distinct form of physical or body communication. In everyday life, people pose as an expression of feelings. It is a temporary pause or suspension of animation, where one's posture conveys a message. Commonly in sports, referees or umpires use poses to signal a foul or the outcome of an event. In football, referees signal touchdowns or field goals by raising their arms above their heads. In basketball, referees use the same pose to signify a three-point basket. Watch a baseball game and pay attention to the third base coach or the catcher. Both use a series of poses to relay a message to the batter and pitcher, respectively. Poses in baseball, where signal stealing is common, get complex. If a coach touches the bill of his hat and then the buckle of his belt, he means for the base runner to steal a base. However, it might be a decoy message when the coach touches the bill of his hat and then the tip of his nose.

Poses can be confused with gestures, but they are in fact two different things. As stated, when a person poses, she holds a specific body position or posture. The implication is that a person remains still when posing. A gesture involves action, while a pose is inert. In baseball, the umpire gestures to signal a strikeout. A wave is another example of a gesture. On touch screens, users employ the pinch gesture to zoom in. Still another form of gesture is when a person swipes by flicking a finger across a touch screen.
These are gestures, because the person is performing an action. Shaking a fist in anger is a gesture; displaying one's middle finger to another, however, is a pose.
In the early life of Kinect development, more attention and development effort has been directed toward gesture recognition than pose recognition. This is unfortunate, but understandable. The marketing messages used to sell Kinect focus on movement. The Kinect name itself is derived from the word kinetic, which means producing motion. Kinect is sold as a tool for playing games where your actions, your gestures, control the game. Gestures create challenges for developers and user experience designers. As we examine in greater detail in the next chapter, gestures are not always easy for users to execute and can be extremely difficult for an application to detect. Poses, in contrast, are deliberate acts of the user, and are more consistent in form and execution.

While poses have received little attention, they have the potential for more extensive use in all applications, even games, than at present. Generally, poses are easier for users to perform, and much easier to write detection algorithms for. The technical solution to determine if a person is signaling a touchdown by raising their arms above their head is easier to implement than detecting a person running in place or jumping.

Imagine creating a game where the user is flying through the air. One way of controlling the experience is to have the user flap his arms like a bird. The more the user flaps, the faster he flies. That would be a gesture. Another option is to have the user extend his arms away from his body. The more extended the arms, the faster the user flies; the closer the arms are to the body, the slower he flies. In Simon Says, the user must extend his arms outward to touch both hand targets in order to start the game. An alternative option, using a pose, is to detect when a user has both arms extended. The question then is: how do we detect poses?
PoseDetection
The posture and position of a user's body joints define a pose; more specifically, a pose is the relationship of each joint to another. The type and complexity of the pose determine the complexity of the detection algorithm. A pose is detectable either by the intersection or position of joints, or by the angle between joints. Detecting a pose through intersection is not as involved as working with angles, and therefore provides a good introduction to pose detection.
Pose detection through intersection is hit testing for joints. Earlier in the chapter, we detected when a joint position was within the coordinate space of a visual element. We do the same type of test for joints. The difference is that it requires less work, because the joints are in the same coordinate space and the calculations are easier. For example, take the hands-on-hips pose. Skeleton tracking tells us the positions of the left and right hip joints as well as the left and right hand joints. Using vector math, calculate the length between the left hand and the left hip. If the length between the two points is less than some variable threshold, then the joints are considered to be intersecting. The threshold distance should be small. Testing for an exact intersection of points, while technically possible, creates a poor user interface, just as we discovered with visual element hit testing. The skeleton data coming back from Kinect jitters even with smoothing parameters applied, so much so that exact joint matches are virtually impossible. Additionally, it is unreasonable to expect a user to make smooth and consistent movements, or even hold a joint position for an extended period of time. In short, the precision of the user's movements and the accuracy of the data preclude the practicality of such a simple calculation. Therefore, calculating the length between the two positions and testing for the length to be within a threshold is the only viable solution.
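The distance-threshold test reads almost directly from this description. A minimal sketch in Python (the project itself would do this in C# against the joints' Position values):

```python
import math

def joints_intersect(joint_a, joint_b, threshold):
    """Treat two joint positions (x, y) as intersecting when the
    distance between them falls within the threshold, rather than
    demanding an exact match that jitter makes impossible."""
    return math.hypot(joint_a[0] - joint_b[0],
                      joint_a[1] - joint_b[1]) <= threshold
```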
The accuracy of the joint position degrades further when two joints are in tight proximity. It becomes difficult for the skeleton engine to determine where one joint begins and another ends. Test this by having a user place her hand over her face. The head position is roughly the position of one's nose, and the joint positions of the hand and the head will never exactly match. This makes certain poses indistinguishable from others. For example, it is impossible to detect the difference between hand over face, hand on top of head, and hand covering ear. This should not completely discourage application designers and developers from using these poses. While it is not possible to definitively determine the exact pose, if the user is given proper visual instructions by the application, she will perform the desired pose.
Joint intersection does not require using both X and Y positions. Certain poses are detectable using only one plane. For example, take a standing plank pose, where the user stands erect with his arms flat by his side. In this pose, the user's hands are relatively close to the same vertical plane as his shoulders, regardless of the user's size and shape. For this pose, the logic is to test the difference of the X coordinates of the shoulder and hand joints. If the absolute difference is within a small threshold, the joints are considered to be within the same plane. However, this does not guarantee the user is in the standing plank pose. The application must also determine if the hands are below the shoulders on the Y axis. This type of logic produces a high degree of accuracy, but is still not perfect. There is no simple approach to determining if the user is actually standing. The user could be on his knees or just have his knees slightly bent, making pose detection an inexact science.
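The standing plank check combines the two tests just described: X coordinates within a threshold, and hands below shoulders. A sketch in Python; the coordinate convention (Y increasing upward, as in skeleton space) and the threshold value are illustrative assumptions:

```python
def is_standing_plank(left_shoulder, left_hand,
                      right_shoulder, right_hand, x_threshold=0.1):
    """Each (x, y) hand must lie near its shoulder's vertical plane
    and below it (smaller y, with y increasing upward)."""
    for shoulder, hand in ((left_shoulder, left_hand),
                           (right_shoulder, right_hand)):
        if abs(shoulder[0] - hand[0]) > x_threshold:
            return False  # hand is outside the shoulder's vertical plane
        if hand[1] >= shoulder[1]:
            return False  # hand is not below the shoulder
    return True
```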
Not all poses are detectable using joint intersection techniques, but those that are can be detected more accurately using another technique. Take, for example, a pose where the user extends her arms outward, away from the body but level with the shoulders. This is called the T pose. Using joint intersection, an application can detect if the hand, elbow, and shoulder are in the same relative Y plane. Another approach is to calculate the angle between different joints in the body. The Kinect SDK's skeleton engine detects up to twenty skeleton points, any two of which can be used to form a triangle. The angles of these triangles are calculated using trigonometry.

From the skeleton tracking data, we can draw a triangle using any two joint points. The third point of the triangle is derived from the other two points. Knowing the coordinates of each point in the triangle means that we know the length of each side, but no angle values. Applying the Law of Cosines gives us the value of any desired angle. The Law of Cosines states that c² = a² + b² − 2ab·cos(C), where C is the angle opposite side c. This formula derives from the commonly known Pythagorean theorem, c² = a² + b². Calculations on the joint points give the values for a, b, and c. The unknown is angle C. Transforming the formula to solve for the unknown angle C yields C = cos⁻¹((a² + b² − c²) / 2ab). Arccosine (cos⁻¹) is the inverse of the cosine function, and returns the angle for a given cosine value.
Figure 5-9. Law of Cosines
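The transformed formula is easy to verify numerically. A quick sketch in Python; the project's own C# version appears later in Listing 5-15:

```python
import math

def angle_from_sides(a, b, c):
    """Law of Cosines solved for the angle C opposite side c:
    C = acos((a^2 + b^2 - c^2) / 2ab), returned in degrees."""
    return math.degrees(math.acos((a * a + b * b - c * c) / (2 * a * b)))
```

A 3-4-5 right triangle yields 90 degrees for the angle opposite the longest side, as expected.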
To demonstrate pose detection using joint triangulation, consider the pose where the user is flexing his bicep. In this pose, the arm from the shoulder to the elbow is roughly parallel to the floor, with the forearm (elbow to wrist) drawn to the shoulder. Here it is easy to see the form of a right or acute triangle. For right and acute triangles, we can use basic trigonometry, but not for obtuse triangles. Therefore, we use the Law of Cosines, as it works for all triangles. Using it exclusively keeps the code clean and simple. Figure 5-10 shows a skeleton in the bicep flex pose with a triangle overlaid to illustrate the math.
Figure 5-10. Calculating the angle between two joints
The figure shows the positions of three joints: wrist, elbow, and shoulder. The lengths of the three sides (a, b, and c) are calculated from the three joint positions. Plugging the side lengths into the transformed Law of Cosines equation, we solve for angle C. In this example, the value is 93.875 degrees.

There are two methods of joint triangulation. The most obvious approach is to use three joints to form the three points of the triangle, as shown in the bicep pose example. The other uses two joints, with the third triangle point derived in part arbitrarily. The approach to use depends on the complexity and restrictions of the pose. In this example, we use the three-joint method, because the desired angle is the one created from wrist to elbow to shoulder. The angle should always be the same regardless of the angle between the arm and torso (armpit) or the angle of the torso and the hips. To understand this, stand straight and flex your bicep. Without moving your arm and forearm, flex at the hip to touch the side of your knee with your other (non-flexed) hand. The angle between the wrist, elbow, and shoulder joints is the same, but the overall body pose is different, because the angle between the torso and hips has changed. If the bicep flex pose were strictly defined as the user standing straight with the bicep flexed, then the three-joint approach in our example fails to properly validate the pose.

To apply the two-joint method to the bicep flex pose, use only the elbow and the wrist joints. The elbow becomes the center or zero point of the coordinate system. The wrist position establishes the defining point of the angle. The third point of the triangle is any arbitrary point along the X axis of the elbow point. The Y value of the third point is always the same as that of the zero point, which in this case is the elbow. In the two-joint method, the calculated angle is different when the user is standing straight as opposed to leaning.
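Because the two-joint method's third point always lies on the X axis through the zero point, the triangle collapses into a simple polar-angle calculation. The sketch below, in Python, uses atan2 as an equivalent shortcut rather than the book's Law of Cosines code, and assumes math-style coordinates with Y increasing upward:

```python
import math

def two_joint_angle(center, angle_joint):
    """Angle at `center` between `angle_joint` and an arbitrary point
    on the positive X axis through `center`, in degrees [0, 360)."""
    dx = angle_joint[0] - center[0]
    dy = angle_joint[1] - center[1]
    return math.degrees(math.atan2(dy, dx)) % 360
```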
Reacting to Poses
Understanding how to detect poses only satisfies the technical side of Kinect application development. What the application does with this information, and how it communicates with the user, is equally critical to the application functioning well. The purpose of detecting poses is to initiate some action from the application. The simplest approach for any application is to trigger an action immediately upon detecting the pose, similar to a mouse click.
What makes Kinect fascinating and cool is that the user is the input device, but this also introduces new problems. The most challenging problem for developers and designers is precisely that the user is the input device, because users do not always act as desired and expected. For decades, developers and designers have been working to improve keyboard- and mouse-driven applications so that applications are robust enough to handle anything users throw at them. Most of the techniques learned for keyboard and mouse input do not apply to Kinect. When using a mouse, the user must deliberately click the mouse button to execute an action (well, most of the time the mouse click is deliberate). When mouse clicks are accidental, there is no way for an application to know it was an accident, but because the user is required to push the button, accidents happen less often. With pose detection, this is not always the case, because users are constantly posing.
Applications using pose detection must know when to ignore and when to react to poses. As stated, the easiest approach is for the application to react immediately to the pose. If this is the desired function of the application, choose distinct poses that a user may not naturally revert to when resting or relaxing. Choose poses that are easy to perform, but are not natural or common in general human movement. This requires the pose to be a more deliberate action, much like the mouse click. Instead of immediately reacting to the pose, an alternative is to start a timer. The application then reacts only if the user holds the pose for a specific duration. This is arguably considered a gesture; we will defer diving deeper into that argument until the next chapter.
Another approach to responding to user poses is to use a sequence of poses to trigger an action. This requires the user to perform a number of poses in a specific sequence before the application executes an action. Think back to the baseball example, where the coach is giving a set of signals to the players. There is always one pose that indicates that the pose that follows is a command. If the coach touches his nose and then his belt buckle, the runner on first should steal second base. However, if the coach touches his ear and then the belt buckle, this means nothing. The touching of the nose is the indicator that the next pose is a command to follow. Using a sequence of poses along with uncommon posturing clearly indicates that the user very purposely desires the application to execute a specific action. In other words, the user is less likely to accidentally trigger an undesired action.
Simon Says Revisited
Looking back on the Simon Says project, let's redo it, but instead of using visual element hit testing, we will use poses. In our second version, Simon instructs the player to pose in a specific sequence instead of touching targets. Detecting poses using joint angles gives the application the greatest range of poses. The more poses available, and the crazier the poses, the more fun the player has. If your application experience is fun, then it is a success.
Tip: This version of Simon Says makes a fun drinking game, but only if you are old enough to drink! Please drink responsibly.
Using poses in place of visual targets requires changing a large portion of the application, but not in a bad way. The code necessary to detect poses is less than that needed to perform hit testing and determine if a hand has entered or left a visual element's space. The pose detection code focuses on math, specifically trigonometry. Besides the changes to the code, there are changes to the user experience and game play. All of the bland boxes go away. The only visual elements left are the TextBlocks and the elements for the hand cursors. We will need some way of telling the user what pose to perform. The best approach is to create graphics or images showing the exact shape of the pose. Understandably, not everyone is a graphic designer or has access to one. A quick and dirty way is to display the name of the pose in the instructions TextBlock, which is going to be our approach. This works for debugging and testing, and buys you enough time to make friends with a graphic designer.
The game play changes, too. Removing the visual element hit testing means we have to create a completely new approach to starting the game. This is easy: we make the user pose! The start pose for the new Simon Says will be the same as before. In the first version, the user extended her arms to hit the two targets. This is a T pose, because the player's body resembles the letter T. The new version of Simon Says starts a new game when it detects the user in a T pose.

In the previous Simon Says, the instruction sequence pointer advanced when the user successfully hit the target, or the game ended if the player hit another target. In this version, the player has a limited time to reproduce the pose. If the user fails to pose correctly in the allotted time, the game is over. If the pose is detected, the game moves to the next instruction and the timer restarts.
Before writing any game code, we must build some infrastructure. For the game to be fun, it needs to be capable of detecting any number of poses. Additionally, it must be easy to add new poses to the game. To facilitate creating a pose library, create a new class named PoseAngle and a structure named Pose. The code is shown in Listing 5-12. The Pose structure simply holds a name and an array of PoseAngle objects. The decision to use a structure instead of a class is for simplicity only. The PoseAngle class holds the two JointTypes necessary to calculate the angle, the required angle between the joints, and a threshold value. Just as with visual element hit testing, we will not require the user to ever absolutely match the angle, as this is impossible. As with visual element hit testing, we only require the user to be within a range of angles: plus or minus the threshold from the target angle.
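The plus-or-minus threshold test, including the wraparound at 0/360 degrees that the game code later has to handle, can be sketched compactly. In Python, as a language-agnostic illustration of the same check:

```python
def angle_matches(measured, target, threshold):
    """True when `measured` is within +/- threshold degrees of
    `target`, treating 0 and 360 as the same angle."""
    diff = abs(measured - target) % 360
    # The shorter way around the circle is the real difference.
    return min(diff, 360 - diff) <= threshold
```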
Listing 5-12. Classes to Store Pose Information
public class PoseAngle
{
    public PoseAngle(JointType centerJoint, JointType angleJoint,
                     double angle, double threshold)
    {
        CenterJoint = centerJoint;
        AngleJoint  = angleJoint;
        Angle       = angle;
        Threshold   = threshold;
    }

    public JointType CenterJoint { get; private set; }
    public JointType AngleJoint { get; private set; }
    public double Angle { get; private set; }
    public double Threshold { get; private set; }
}

public struct Pose
{
    public string Title;
    public PoseAngle[] Angles;
}
With the necessary code in place to store pose configuration, we write the code to create the game poses. In MainWindow.xaml.cs, create new member variables _PoseLibrary and _StartPose, and a method named PopulatePoseLibrary. This code is shown in Listing 5-13. The PopulatePoseLibrary method creates the definition of the start pose (the T pose) and two poses to be used during game play. The first game pose, titled "Touch Down", resembles a football referee signaling a touchdown. The other game pose, titled "Scarecrow", is the inverse of the first.
Listing 5-13. Creating a Library of Poses
private Pose[] _PoseLibrary;
private Pose _StartPose;

private void PopulatePoseLibrary()
{
    this._PoseLibrary = new Pose[2];
    PoseAngle[] angles;

    //Start Pose - Arms Extended (T Pose)
    this._StartPose       = new Pose();
    this._StartPose.Title = "Start Pose";
    angles    = new PoseAngle[4];
    angles[0] = new PoseAngle(JointType.ShoulderLeft, JointType.ElbowLeft, 180, 20);
    angles[1] = new PoseAngle(JointType.ElbowLeft, JointType.WristLeft, 180, 20);
    angles[2] = new PoseAngle(JointType.ShoulderRight, JointType.ElbowRight, 0, 20);
    angles[3] = new PoseAngle(JointType.ElbowRight, JointType.WristRight, 0, 20);
    this._StartPose.Angles = angles;

    //Pose 1 - Both Hands Up (Touch Down)
    this._PoseLibrary[0]       = new Pose();
    this._PoseLibrary[0].Title = "Touch Down!";
    angles    = new PoseAngle[4];
    angles[0] = new PoseAngle(JointType.ShoulderLeft, JointType.ElbowLeft, 180, 20);
    angles[1] = new PoseAngle(JointType.ElbowLeft, JointType.WristLeft, 90, 20);
    angles[2] = new PoseAngle(JointType.ShoulderRight, JointType.ElbowRight, 0, 20);
    angles[3] = new PoseAngle(JointType.ElbowRight, JointType.WristRight, 90, 20);
    this._PoseLibrary[0].Angles = angles;

    //Pose 2 - Both Hands Down (Scarecrow)
    this._PoseLibrary[1]       = new Pose();
    this._PoseLibrary[1].Title = "Scarecrow";
    angles    = new PoseAngle[4];
    angles[0] = new PoseAngle(JointType.ShoulderLeft, JointType.ElbowLeft, 180, 20);
    angles[1] = new PoseAngle(JointType.ElbowLeft, JointType.WristLeft, 270, 20);
    angles[2] = new PoseAngle(JointType.ShoulderRight, JointType.ElbowRight, 0, 20);
    angles[3] = new PoseAngle(JointType.ElbowRight, JointType.WristRight, 270, 20);
    this._PoseLibrary[1].Angles = angles;
}
With the necessary infrastructure in place, we implement the changes to the game code, starting with detecting the start of the game. When the game is in the GameOver state, the ProcessGameOver method is continually called. The purpose of this method was originally to detect if the player's hands were over the start targets. This code is replaced with code that detects if the user is in a specific pose. Listing 5-14 details the code to start the game play and to detect a pose. It is necessary to have a single method that detects a pose match, because we use it in multiple places in this application. Also, note how dramatically less code is in the ProcessGameOver method.

The code to implement the IsPose method is straightforward until the last few lines. The code loops through the PoseAngles defined in the pose parameter, calculating the joint angle and validating it against the angle defined by the PoseAngle. If any PoseAngle fails to validate, IsPose returns false. The if statement tests to ensure that the angle range defined by the loAngle and hiAngle values is not outside the degree range of a circle. If the values fall outside this range, they are adjusted before validating.
Listing 5-14. Updated ProcessGameOver
private void ProcessGameOver(Skeleton skeleton)
{
    if(IsPose(skeleton, this._StartPose))
    {
        ChangePhase(GamePhase.SimonInstructing);
    }
}

private bool IsPose(Skeleton skeleton, Pose pose)
{
    bool isPose = true;
    double angle;
    double poseAngle;
    double poseThreshold;
    double loAngle;
    double hiAngle;

    for(int i = 0; i < pose.Angles.Length && isPose; i++)
    {
        poseAngle     = pose.Angles[i].Angle;
        poseThreshold = pose.Angles[i].Threshold;
        angle         = GetJointAngle(skeleton.Joints[pose.Angles[i].CenterJoint],
                                      skeleton.Joints[pose.Angles[i].AngleJoint]);

        hiAngle = poseAngle + poseThreshold;
        loAngle = poseAngle - poseThreshold;

        if(hiAngle >= 360 || loAngle < 0)
        {
            loAngle = (loAngle < 0) ? 360 + loAngle : loAngle;
            hiAngle = hiAngle % 360;
            isPose  = !(loAngle > angle && angle > hiAngle);
        }
        else
        {
            isPose = (loAngle <= angle && hiAngle >= angle);
        }
    }

    return isPose;
}
The IsPose method calls the GetJointAngle method to calculate the angle between the two joints. GetJointAngle in turn calls the GetJointPoint method to get the point of each joint in the main layout space. This step is technically unnecessary: the raw position values of the joints are all that is needed to calculate the joint angles. However, converting the values to the main layout coordinate system helps with debugging. With the joint positions, regardless of the coordinate space, the method then implements the Law of Cosines formula to calculate the angle between the joints. The arccosine method (Math.Acos) returns values in radians, making it necessary for us to convert the angle value to degrees. The final if statement handles angles between 180 and 360 degrees. The Law of Cosines formula only works for angles between 0 and 180 degrees; the if block adjusts values for angles falling into the third and fourth quadrants of the graph.
Listing 5-15. Calculating the Angle Between Two Joints

private double GetJointAngle(Joint zeroJoint, Joint angleJoint)
{
    Point zeroPoint  = GetJointPoint(zeroJoint);
    Point anglePoint = GetJointPoint(angleJoint);
    Point x          = new Point(zeroPoint.X + anglePoint.X, zeroPoint.Y);

    double a;
    double b;
    double c;

    a = Math.Sqrt(Math.Pow(zeroPoint.X - anglePoint.X, 2) +
                  Math.Pow(zeroPoint.Y - anglePoint.Y, 2));
    b = anglePoint.X;
    c = Math.Sqrt(Math.Pow(anglePoint.X - x.X, 2) + Math.Pow(anglePoint.Y - x.Y, 2));

    double angleRad = Math.Acos((a * a + b * b - c * c) / (2 * a * b));
    double angleDeg = angleRad * 180 / Math.PI;

    if(zeroPoint.Y < anglePoint.Y)
    {
        angleDeg = 360 - angleDeg;
    }

    return angleDeg;
}
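To convince yourself the arithmetic works, trace the formula with simple fixed coordinates in place of joint positions. This standalone sketch uses a zeroPoint of (0, 0) and an anglePoint of (1, 1); the point values are ours, chosen only for illustration.

```csharp
// A standalone trace of the Law of Cosines arithmetic from Listing 5-15,
// using fixed points instead of joint positions.
double zeroX = 0, zeroY = 0;   // zeroPoint
double angX  = 1, angY  = 1;   // anglePoint
double xX    = zeroX + angX;   // the constructed point x
double xY    = zeroY;

double a = Math.Sqrt(Math.Pow(zeroX - angX, 2) + Math.Pow(zeroY - angY, 2)); // ~1.414
double b = angX;                                                             // 1
double c = Math.Sqrt(Math.Pow(angX - xX, 2) + Math.Pow(angY - xY, 2));       // 1

double angleDeg = Math.Acos((a * a + b * b - c * c) / (2 * a * b)) * 180 / Math.PI;
// angleDeg works out to 45; because zeroY < angY, Listing 5-15 would
// return 360 - 45 = 315 for these points.
```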
The code needed to detect poses and start the game is in place. When the game detects the start
pose, it transitions into the SimonInstructing phase. The code changes for this phase are isolated to the
GenerateInstructions and DisplayInstructions methods. The updates for GenerateInstructions remain
largely the same: populate the instructions array with a randomly selected pose from the pose library.
The DisplayInstructions method is an opportunity to get creative in the way you present the sequence
of instructions to the player. We will leave these updates to you.
Once the game completes the presentation of instructions, it transitions to the PlayerPerforming
stage. The updated game rules give the user a limited time to perform the instructed pose. When the
application detects the user in the required pose, it advances to the next pose and restarts the timer. If
the timer goes off before the player reproduces the pose, the game ends. WPF's DispatcherTimer makes
it easy to implement the timer feature. The DispatcherTimer object is in the System.Windows.Threading
namespace. The code to initialize and handle the timer expiration is in Listing 5-16. Create a new
member variable, and add the code in the listing to the MainWindow constructor.
Listing 5-16. Timer Initialization

this._PoseTimer          = new DispatcherTimer();
this._PoseTimer.Interval = TimeSpan.FromSeconds(10);
this._PoseTimer.Tick    += (s, e) => { ChangePhase(GamePhase.GameOver); };
this._PoseTimer.Stop();
The final code update necessary to use poses in Simon Says is shown in Listing 5-17. This listing
details the changes to the ProcessPlayerPerforming method. On each call, it validates the current pose in
the sequence against the player's skeleton posture. If the correct pose is detected, it stops the timer and
moves to the next pose instruction in the sequence. The game changes to the instructing phase when the
player reaches the end of the sequence. Otherwise, the timer is restarted for the next pose.
Listing 5-17. Updated ProcessPlayerPerforming Method

private void ProcessPlayerPerforming(Skeleton skeleton)
{
    int instructionSeq = this._InstructionSequence[this._InstructionPosition];

    if(IsPose(skeleton, this._PoseLibrary[instructionSeq]))
    {
        this._PoseTimer.Stop();
        this._InstructionPosition++;
    }

    if(this._InstructionPosition >= this._InstructionSequence.Length)
    {
        ChangePhase(GamePhase.SimonInstructing);
    }
    else
    {
        this._PoseTimer.Start();
    }
}
With this code added to the project, Simon Says detects poses in place of visual element hit testing.
This project is a practical example of pose detection and how to implement it in an application
experience. With the infrastructure code in place, create new poses and add them to the game. Make
sure to experiment with different types of poses. You will discover that not all poses are easily detectable,
and some do not work well in Kinect experiences.
As with any application, but especially so for a Kinect-driven application, the user experience is
critical to a successful application. After the first run of the new Simon Says, it is markedly obvious that
much is missing from the game. The user interface lacks many of the elements necessary to make it
effective, or even a fun game. Having a fun experience is the point, after all. The game lacks any
user feedback, which is paramount to a successful user experience. For Simon Says to become a true
Kinect-driven experience, it must provide the user visual cues when the game starts and ends. The
application should reward players with a visual effect when they successfully perform poses. The type of
feedback and how it looks is for you to decide. Be creative! Make the game entertaining to play and
visually striking. Here are a few other ideas for enhancements:
• Create more poses. Adding new poses is easy to do using the Pose class. The infrastructure is in place. All you need to do is determine the angles of the joints and build the Pose objects.
• Adjust the gameplay by speeding up the pose timer each round. This makes the user more active and engaged in the game.
• Apply more pressure by displaying the timer in the user interface. Showing the timer on the screen applies stress to the user, but in a playful manner. Adding visual effects to the screen or the timer as it closely approaches zero adds further pressure.
• Take a snapshot! Add the code from Chapter 2 to take snapshots of users while they are in poses. At the end of the game, display a slideshow of the snapshots. This creates a truly memorable gaming experience.
Reflect and Refactor
Looking back on this chapter, the most reusable code is the pose detection code from the revised Simon
Says project. In that project, we wrote enough code to start a pose detection engine. It is not ridiculous to
speculate that a future version of the Microsoft Kinect SDK will include a pose detection engine, but this
is absent from the current version. Given that Microsoft has not provided any indication of the future
features in the Kinect SDK, it is worthwhile to create such a tool. There have been some attempts to
create similar tools by the online Kinect developer community, but so far, none has emerged as the
standard.
For those who are industrious and willing to build their own pose engine, imagine a class named
PoseEngine, which has a single event named PoseDetected. This event fires when the engine detects that
a skeleton has performed a pose. By default, the PoseEngine listens to SkeletonFrameReady events, but
would also have a means to manually test for poses on a frame-by-frame basis, making it serviceable
under a polling architecture. The class would hold a collection of Pose objects, which define the
detectable poses. Using the Add and Remove methods, similar to a .NET List, a developer defines the pose
library for the application.
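As a starting point, the class described above might be sketched like this. The member names, the event-args type, and the injected pose test are our speculation, not an API from the SDK; the Pose and Skeleton types come from this chapter's project and the Kinect SDK.

```csharp
// Speculative sketch of the PoseEngine described above.
public class PoseEngine
{
    private readonly List<Pose> _Poses = new List<Pose>();
    private readonly Func<Skeleton, Pose, bool> _IsPose;

    // The pose test (for example, IsPose from Listing 5-14) is injected so
    // the engine stays independent of any one matching strategy.
    public PoseEngine(Func<Skeleton, Pose, bool> isPose)
    {
        this._IsPose = isPose;
    }

    // Raised when a skeleton matches one of the registered poses.
    public event EventHandler<PoseDetectedEventArgs> PoseDetected;

    public void Add(Pose pose)    { this._Poses.Add(pose); }
    public void Remove(Pose pose) { this._Poses.Remove(pose); }

    // Manual, frame-by-frame testing, serviceable under a polling architecture.
    public void CheckForPoses(Skeleton skeleton)
    {
        foreach(Pose pose in this._Poses)
        {
            if(this._IsPose(skeleton, pose))
            {
                EventHandler<PoseDetectedEventArgs> handler = this.PoseDetected;

                if(handler != null)
                {
                    handler(this, new PoseDetectedEventArgs(pose));
                }
            }
        }
    }
}

public class PoseDetectedEventArgs : EventArgs
{
    public PoseDetectedEventArgs(Pose pose) { this.Pose = pose; }
    public Pose Pose { get; private set; }
}
```

A SkeletonFrameReady handler would simply call CheckForPoses for each tracked skeleton, while a polling application could call it from its own processing loop.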
To facilitate adding and removing poses at runtime, the pose definitions cannot be hard-coded like
they are in the Simon Says project. The simplicity of these objects means serialization is straightforward.
Serializing the pose data provides two advantages. The first is that poses are more easily added to and
removed from an application. Applications can read poses from configuration when the application
loads, or dynamically add new poses during the application's runtime. Further, the ability to persist
pose configuration means we can build tools to create pose configurations by capturing or recording
poses.
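As an illustration of how little code this takes, if Pose and PoseAngle are plain classes with public read/write properties (as in this chapter's project), the built-in XmlSerializer can persist a pose library directly. The file name here is arbitrary, and the _PoseLibrary array is the member used in Simon Says.

```csharp
// Sketch: persisting and reloading a pose library with XmlSerializer.
// Assumes Pose and PoseAngle expose public properties and a public
// parameterless constructor, as XmlSerializer requires.
XmlSerializer serializer = new XmlSerializer(typeof(Pose[]));

// Save the current pose library to disk.
using(FileStream stream = File.Create("PoseLibrary.xml"))
{
    serializer.Serialize(stream, this._PoseLibrary);
}

// Load the poses back, for example at application start-up.
using(FileStream stream = File.OpenRead("PoseLibrary.xml"))
{
    this._PoseLibrary = (Pose[]) serializer.Deserialize(stream);
}
```

With the poses in a file, adding a new pose to an application is a matter of editing or regenerating the XML rather than recompiling.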
It's easy to envision a tool to capture and serialize the poses for application use. This tool is a Kinect
application that uses all of the techniques and knowledge presented thus far. Taking the SkeletonViewer
control created in the previous chapter, add the joint angle calculation logic from Simon Says. Update
the output of the SkeletonViewer to display the angle value and draw an arc to clearly illustrate the joint
angle. The pose capture tool would then have a function to take a snapshot of the user poses, a snapshot
being nothing more than a recording of the various joint angles. Each snapshot is serialized, making it
easy to add to any application.
A much quicker solution is to update the SkeletonViewer control to display the joint angles. Figure
5-11 shows what the output might be. This allows you to quickly see the angles of the joints. Pose
configuration can be manually created this way. Even with a pose detection engine and pose builder
tool, updating the SkeletonViewer to include joint angles becomes a valuable debugging tool.
Figure 5-11. Illustrating the angle change between the right elbow and right wrist
Summary
The Kinect presents a new and exciting challenge to developers. It is a new form of user input into our
applications. Each new input device has commonalities with other devices, but also has its own unique
features. In this chapter, we introduced WPF's input system and how Kinect input is similar to
the mouse and touch devices. The conversation covered the primitives of user interface interactions,
specifically hit testing, and we provided a demonstration of these principles with the Simon Says game.
From there we expanded the discussion to illustrate how the unique feature of Kinect (Z data) can be
used in an application.
We concluded the chapter by introducing the concept of a pose and how to detect poses. This
included updating the Simon Says game to use poses to drive gameplay. Poses are a unique way for a
user to communicate an action or series of actions to the application. Understanding how to detect
poses is just the beginning. The next phase is defining what a pose is, followed by agreeing on common
poses and standardizing pose names. The more fundamentally important part of the pose challenge is
determining how to react once the pose is detected. This has technical as well as design implications.
The technical considerations are more easily accomplished, requiring tools for processing skeleton data
to recognize poses and to notify the user interface. Ideally, this type of behavior would be integrated into
WPF or at the very least included in the Kinect for Windows SDK.
One of the strengths of WPF that distinguishes it above all other application platforms is its
integration of input devices with its control, style, and template engines. Processing Kinect skeleton data as
an input device is not a natural function of WPF. This falls to the developer, who has to rewrite much of
the low-level code WPF already has for other input devices. The hope is that someday Microsoft will
integrate Kinect into WPF as a native input device, freeing developers from the burden of manually
reproducing this effort, so they can focus on building exciting, fun, and engaging Kinect experiences.
CHAPTER 6
Gestures
Gestures are central to Kinect just as clicks are central to GUI platforms and taps are central to touch
interfaces. Unlike the digital interaction idioms of the graphical user interface, gestures are peculiar in
that they already exist in the real world. Without computers we would have no need for the mouse.
Gestures, on the other hand, are a basic part of everyday communication. They are used to enhance our
speech and provide emphasis as well as to indicate mood. Gestures like waving and pointing are used in
their own right as a form of unarticulated speech. The vocabulary of gestures is so plentiful that we even
have a subclass we qualify as “obscene.” Needless to say, there is no such thing as an obscene click.
The task for Kinect designers and developers going forward is to map real-world gestures to
computer interactions in ways that make sense. Since gesture-based human-computer interaction can
seem like terra incognita for people new to it, there is a great temptation to simply try to port existing
interactions from mouse-based GUI design or touch-based NUI design. While it cannot always be
avoided, please resist this temptation. As computer engineers and interaction designers work together to
create a new gesture vocabulary for Kinect, we can draw on the work of researchers who have been
playing with these concepts for over 30 years as well as the brilliant innovations of gaming companies
like Harmonix over the past year as they have brought Kinect games to market.
In this chapter, we will examine some of the concepts behind user experiences and see how they
apply to gestures for Kinect. We will show how Kinect fits into the broader model of human-computer
interaction known as natural user interfaces (NUI). We will also look at concrete examples of gestures
used for interacting with Kinect and show how these various theories inform them (or do not). Most
importantly, we will show the reader how to implement some of the gestures that have already become
part of the standard Kinect gesture vocabulary.
Defining a Gesture
The “gesture” has become a specialized term for so many different disciplines that it buckles under the
weight of the various overlapping and sometimes conflicting meanings ascribed to it. It is fascinating to
linguists because it marks the lower threshold of their field of study; it is the unuttered version of spoken
language and is considered by some to be a proto-language. In semiotics (the study of signs), gestures
are just one of many things that can signify or stand in for other things, such as words, images, myths,
rituals, mathematical formulas, maps, and tea leaves.
In the arts, “gesture” is used to describe the most expressive aspects of dance, especially in Asian
dance techniques where hand poses are categorized and given religious significance. In the fine arts,
brush strokes are described as gestures, and the cautious stroke work of a Vermeer is contrasted with the
broad “gestures” of a Van Gogh or a Jackson Pollock. Finally, in interactive design, “gestures” are
distinguished from “manipulations” in touch-based NUI experiences.
A dictionary approach to coping with all this variety would simply involve pointing out that the term
has many meanings and delineating what those multiple denotations are. In academic circles, however,
the prevailing tendency is to create an abstract definition that captures—with varying degrees of
success—all the multifarious meanings under one rubric. In UX circles, the most widely circulated
definition of this latter type is one proposed by Eric Hulteen and Gord Kurtenbach in their 1990 paper
Gestures in Human-Computer Communication:
“A gesture is a motion of the body that contains information. Waving goodbye is a
gesture. Pressing a key on a keyboard is not a gesture because the motion of a finger on
its way to hitting a key is neither observed nor significant. All that matters is which
key was pressed.”
This definition has the virtue of capturing both what a gesture is as well as explaining what it is not. A
formal definition like this presents two challenges. It must avoid being too specific or too broad. A
definition that is too specific—for instance, one that targets a certain technology—risks becoming
obsolete as UI technologies change over time. As an academic definition rather than a definition based
on common usage, it also must be general enough that it incorporates, or at least speaks to, the vast body
of research that has previously been published in HCI studies as well as in semiotics and in the arts. On
the other hand, a definition that is too broad risks being irrelevant: if everything is a gesture, then
nothing is.
The central distinction proposed by Kurtenbach and Hulteen is between movements that
communicate and ones that do not, between talking (so to speak) and doing. It also turns out to be an
ancient distinction in human culture. In the Iliad, which dates back at least to the 6th century BC, Homer
tells us that a leader must be trained “…to be both a speaker of words and a doer of deeds.” The
democratic inhabitants of Athens, a city-state based on words and laws, would knowingly retell the story
of how the tyrant Xerxes, who ruled by force alone, flogged the Hellespont for not obeying his will. At the
same time, Sisyphus could never cajole his boulder to move by words alone, but had to push it up the
same hill, eternally. Without words, it is difficult to move a room full of people into another room. With
words nations can be set to war. At the same time, arguments can go on interminably while, by just
kicking a stone, Samuel Johnson famously refuted the philosopher Bishop Berkeley's stance that matter
was not real and the world an illusion—or at least he claimed that he did.
What is fascinating is that when words and deeds are brought to the task of building human-computer
interfaces, they both undergo a radical transformation. Speech in our interactions with
computers becomes mute. We communicate with computing devices not through words, but by
pointing, prodding, and gesturing. We tap on keyboard keys or on touch-aware screens. When it comes
to computers, we seem to prefer this form of mute communication even though our current technology
supports more straightforward vocalized commands. Deeds, likewise, lose their force as we manipulate
not real objects but virtual objects with no persistence. Movements become the equivalent of mere
gestures, a euphemism for actions that have no real results. Gestures and manipulations in modern user
interfaces are the equivalent of mute words and empty deeds. In the modern user interface, the line
between words and deeds even sometimes becomes blurred.
If we take for granted that, based on the Kurtenbach and Hulteen definition, we all understand what
a UI manipulation is—provisionally, everything that is not a gesture—the great difficulty still remains of
understanding what a gesture is and what it means for a gesture to be “significant” or to signify. How do
movements communicate meaning? What is communicated by a gesture is clearly different from what
we communicate through discourse. The signifying we do with gestures tends to be more minimalist
and simple.
It should be pointed out that there is not even general agreement that gestures communicate
anything at all. Regarding a 1989 flag burning case before the Supreme Court of the United States, Chief
Justice William Rehnquist argued that the symbolic act of flag burning is “…the equivalent of an
inarticulate grunt or roar.” In this case, the majority of the court stood against the Chief Justice and ruled
that the gesture in question constituted “expressive conduct” protected by the First Amendment
guarantee of free speech.
In human-computer interaction, gestures are typically used to impart simple commands rather
than communicate statements of fact, descriptions of affairs, or states of mind. Gestures, when used
with computers, are imperatives, which is not always the case with human gestures in general. Take, for
instance, the gesture of waving. In the natural world, waving is often used as a way to say “hello.” This is
not generally useful in computer interfaces. Despite the tendency among computer programmers to
write programs that say “hello” to me, I have no interest in saying hello to my computer.
In a busy restaurant, however, the wave gesture means something different. When I wave down a
waiter, the gesture means, “Hey, pay attention to me.” “Hey, pay attention to me” turns out to be a very
useful command when working with computers. When my desktop computer falls asleep, I often start
tapping arbitrarily on keyboard keys or shaking my mouse as a way to say, “Hey, pay attention to me.”
With Kinect, I am able to do something much more intuitive since there is already a gesture common to
most cultures and familiar to me that signifies the command, “Hey, pay attention to me”: I wave to it.
What is signified by a gesture—again, in human-computer interaction—is an intent to have
something happen. A gesture is a command. When I use my mouse to click on a button in a traditional
GUI interface, or tap on a button in a touch interface, I want whatever the button is intended to do to do
that thing. Generally, the button has a label that explains what it is supposed to do: start, cancel, open,
close. My gesture communicates the intent to have that happen. That intent is the information referred
to in Kurtenbach and Hulteen's definition above.
Another aspect of gestures implicit in the above definition is that gestures are arbitrary. Movements
have no meanings outside of the meanings we impart to them. Other than pointing and, surprisingly,
shrugging, anthropologists have not found anything we could call a universal gesture. In computer UIs,
however, pointing is generally considered a direct manipulation since it involves tracking, while the
shrug is simply too subtle to identify. Consequently, any gesture we would want to use with Kinect must
be based on an agreement between users of an application and the designers of the application with
respect to the meaning of that gesture.
Because gestures are arbitrary, they are also conventional. Either the designers of an application
must teach the users the significance of the gestures being used or they must depend on pre-established
conventions. Moreover, these conventions need to be based not on culturally determined rules but
rather on technologically determined rules. We understand how to use a mouse (a learned behavior, it
should be pointed out) not because this is something we have imported from our culture but because
this is based on cross-cultural conventions specific to the graphical user interface. Similarly, we know
how to tap or flick on a smartphone not because these are cultural conventions, but rather because these
are cross-cultural natural user interface conventions. Interestingly, we know how to tap on a tablet in
part because we previously learned how to click with a mouse. Technology conventions can be
transferred from one to another just as words and gestures can be adopted between different languages
and cultures.
Of course, the arbitrary and conventional nature of gestures also gives rise to misunderstandings—
the biggest risk in the design of any user interface and a particularly salient risk with a technology like
Kinect, which does not have many pre-established conventions to rely on. Anthropologists often tell an
anecdote about American football fans abroad as a way to exemplify the dangers of cultural
misunderstandings. According to this story, a bunch of University of Texas alumni are travelling in Italy
and enjoying an afternoon in a tavern when they hear that the Longhorns have just won a game. They
start chanting and roaming around the bar flashing the hook 'em sign, which involves making a gesture
with the hands to represent the horns of a bull. Unfortunately (according to the story), this gesture is
understood by Italian men to mean that they are being called cuckolds, and so a fight breaks out over the
innocent celebration.
This anecdote provides a final way to distinguish between gestures and manipulations. To
paraphrase the semiotician Umberto Eco, if a gesture must signify, then it can also mis-signify. A
gesture, then, is any motion of the body that can be misunderstood.
This is actually the best way to understand why Kurtenbach and Hulteen, in their definition above,
say that tapping the keys of a keyboard is not significant. While I can certainly make an arbitrary gesture
that Kinect misinterprets, I cannot tap a key on a physical keyboard that is misunderstood. Tapping on T
always means tapping on T, and tapping on Y always means tapping on Y. If I accidentally tap on Y when
I mean to type T, this is not a misunderstanding, just as mispronouncing a word is not a
misunderstanding. It is simply a mistake.
This is still not a perfect explanation of the difference between manipulations and gestures, of
course. It does not take keyboard shortcuts into account; keyboard shortcuts can definitely be
misinterpreted depending on the application one is running. It also does not explain why tapping on a
physical keyboard is a manipulation while tapping on a virtual keyboard is a gesture. Given the fact that
we are dealing with overarching definitions, however, these small idiosyncrasies in our provisional
definition are, perhaps, forgivable.
To reiterate, within the context of user interfaces, a gesture:
• expresses a simple command
• is arbitrary in nature
• is based on convention
• can be misunderstood
A manipulation is any movement that is not a gesture.
NUI
No discussion of gestures would be complete without mentioning the natural user interface. Natural
user interface is an umbrella term for several technologies such as speech recognition, multitouch, and
kinetic interfaces like Kinect. It is distinguished from the graphical user interface: the keyboard and
mouse interface common to the Windows operating system and Macs. The graphical user interface, in
turn, is distinguished from the command line interface, which preceded it.
What's natural about the natural user interface? Early proponents of NUI proposed that interfaces
could be designed to be intuitive to users by basing them on innate behaviors. The goal was to have
interfaces that did not need the steep learning curve typically required to operate a GUI-based
application built around icons and menus. Instead, users should ideally be able to walk up to any
application and just start using it. With the growing proliferation of touch-enabled smartphones and
tablets over the past few years, this notion seems to be realized as we see children walk up to any
screen, expecting it to respond to touch.
While this naturalness of the natural user interface seems to be an apt description of direct
manipulation, the dichotomy between natural and learned behavior breaks down when it comes to
gestures for touch interfaces. Some gestures like flicking make a sort of intuitive sense. Others, like
double tapping or tapping and holding, have no innate meaning. Moreover, as different manufacturers
have begun to support touch gestures on their devices, it has become evident that conventions are
needed in order to make the meaning of certain gestures consistent across different touch platforms.
The naturalness of the natural user interface turns out to be a relative term. A more contemporary
understanding of NUIs, heavily influenced by Bill Buxton, holds that natural user interfaces take
advantage of pre-existing skills. These interfaces feel natural to the extent that we forget how we
originally acquired those skills; in other words, we forget that we ever learned them in the first place. The
tap gesture common to tablets and smartphones, for instance, is an application of a skill we all learned
from pointing and clicking with the mouse on traditional graphical user interfaces. The main difference
between a click and a tap is that, with a touchscreen, one does not need a mediating device to touch
with.
This brings out another hallmark of the natural user interface. Interaction between the user and the
computer should appear unmediated. The medium of interaction is invisible. In a speech recognition
interface, for example, there are microphones with complex electronics and filtering mechanisms that
mediate any human-computer interaction. There are software algorithms involved in parsing spoken
phonemes into semantic units, which are passed to additional software that interprets a given phrase
into a command and maps that command to some sort of function. All of this, however, is invisible to
the user. When a user issues a statement such as, “Hey, pay attention to me,” she expects to elicit a
response from the computer similar to the response that this pre-existing skill provokes in most people.
While these two characteristics of natural user interfaces—reliance on pre-existing skills and
unmediated interaction—are common to each individual kind of NUI, other aspects of touch, speech,
and kinetic interfaces tend to be remarkably different. Most of the current thinking around NUI design
has been based on the multitouch experience. This is one of the reasons that the standard definition for
gestures discussed in the previous section is the way it is. It is adapted and distorted for multitouch
scenarios with a central and defining distinction between gestures and manipulations.
An argument can be made that gestures and manipulations also exist in speech interfaces, with
commands being the equivalent of gestures and dictation being the equivalent of direct
manipulations—though this may be a stretch. In kinetic interfaces, hand or body tracking with a visual
representation of the hand or body moving on the screen is the equivalent of direct manipulation. Free-form movements like the wave are considered gestures.
Kinect also has a third class of interaction, however, which has no equivalent in touch or speech
interfaces. The pose, which is a static relation of a part of a person's body to other parts of the body, is
not a movement at all. Posing is used on Kinect for things like the universal pause, which is the left arm
held out at 45 degrees from the body, to bring up an interaction window, and vertical scrolling, which
involves holding the right arm out at 45 or 135 degrees from the body.
Additionally, interactive idioms may be transferred from one type of interface to another with
varying success. Take the button. The button, even more than the icon, has become the pre-eminent
idiom of the graphical user interface. Stripped down to basics, the button is a device for issuing
commands to the user interface using the mouse to point-and-click on a visual element that declares
what command it triggers through text or an image. Over the past fifteen years or so, the button has
become such an integral part of computer-human interaction that it has been imported into multitouch
interfaces and finally even into interfaces for Kinect. Its ubiquity provides the naturalness that is pursued
by designers of natural user interfaces. Each translation of this idiom, however, poses challenges.
A common feature of buttons in graphical user interfaces is that they provide a hover state to indicate
that the user has correctly hovered over the target button. The hover state further breaks down the
point-and-click into discrete moments. This hover state may also provide additional information about
what the button is intended to be used for. When translated to touch interfaces, the button cannot
provide a hover state. Touch interfaces only register touches. Consequently, compared to GUIs, buttons
are visually impoverished and provide the ability to click but no ability to point.
In the translation of the button to Kinect-enabled interfaces, the button becomes even stranger.
Kinect interfaces are inherently the opposite of touch interfaces, providing hover states but no ability to
click. Oddly enough, far from discouraging user experience designers from using buttons, it has forced
them to constantly refine the button over the past year of Kinect to provide more and more ingenious
ways to click on a visual element. This has varied from hovering over a button for a set period of time to
pushing into the air (awkwardly emulating the act of clicking on a button) in a fist pump to posing the
inactive arm in the air.
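The hover-to-click approach, for instance, reduces to little more than a timer: start timing when the hand cursor enters a button's bounds, and treat the hover as a click once the hand has stayed there long enough. A minimal sketch follows; the member names, the GetHandPoint helper, the OnButtonClicked handler, and the 2-second dwell time are all our illustrative assumptions, not a standard API.

```csharp
// Sketch of a hover-to-click ("hover button") test, called once per
// skeleton frame. GetHandPoint is a hypothetical helper that maps the
// hand joint into the UI coordinate space.
private DateTime _HoverStart;
private bool _IsHovering;

private void TrackHover(Skeleton skeleton, Rect buttonBounds)
{
    Point handPoint = GetHandPoint(skeleton);   // hypothetical helper

    if(buttonBounds.Contains(handPoint))
    {
        if(!this._IsHovering)
        {
            // The hand just entered the button; start the dwell clock.
            this._IsHovering = true;
            this._HoverStart = DateTime.Now;
        }
        else if(DateTime.Now.Subtract(this._HoverStart) > TimeSpan.FromSeconds(2))
        {
            // The hand dwelled long enough; treat it as a click.
            this._IsHovering = false;
            OnButtonClicked();                  // hypothetical click handler
        }
    }
    else
    {
        // The hand left the button; reset so a later hover starts fresh.
        this._IsHovering = false;
    }
}
```

Production implementations usually add visual feedback, such as a fill animation on the button, so the user can see the dwell progressing.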
Although even touch interfaces have gestures and Kinect interfaces have classes of interaction that
are not gestures, there is nevertheless a tendency among developers and designers to call the sort of
interface that uses Kinect a gestural interface. The reason for this seems to be that gestures, understood
as physical movements used for communicating, are the most salient feature of Kinect applications. By
contrast, the gestures inherent to what can now be thought of as traditional multitouch interfaces seem
to be gestures only in a secondary sense. The salient feature of touch interfaces is direct manipulation.
While perhaps not exact, it is convenient to be able to talk about three types of NUI: speech interfaces,
touch interfaces, and gestural interfaces.
Consequently, in literature about Kinect, you may find that even poses and manipulations are
described as gestures. There is nothing wrong with this. Just bear in mind that when we discuss
movements such as the wave or the swipe as Kinect idioms, we should think of them as pure gestures,
while poses and manipulations are gestures only in a metaphorical sense.
This is important since, as we further design interaction idioms for Kinect, we will eventually move
away from borrowing idioms like the button from other interface styles and will attempt to interpret
pre-established idioms. The wave, which is the epitome of pure gesturing on Kinect, is an early attempt to
accomplish this. Researchers at The Georgia Institute of Technology are currently working on using
Kinect to interpret American Sign Language. Other researchers, in turn, are working on using Kinect to
interpret body language—another pre-established form of gestural and posed communication. These
sorts of research can be thought of as the second wave of NUI research. They come closer to fulfilling the
original NUI dream of a human-computer interface that is not only invisible but that adapts itself to
understanding us rather than forcing us to understand our computers.
Where Do Gestures Come From?
In gestural interfaces, pure gestures, poses, and tracking can be combined to create interaction idioms.
For Kinect, there are currently eight common gestures in use: the wave, the hover button, the magnet
button, the push button, the magnetic slide, the universal pause, vertical scrolling, and swiping. Where
do these idioms come from? Some of these idioms were introduced by Microsoft itself. Some were
designed by game vendors. Some were created by Kinect for PC developers trying to find ways to build
applications, rather than games, using Kinect.
This is a rare moment in the consumerization of a human-computer interaction idiom. It is actually
unusual to be able to identify eight gestures and claim that these are all the standard gestures commonly
acknowledged and shared within a given class of applications. Similar moments can be identified in the
formulation of web idioms and smartphone gestures as people tried out new designs, only some of
which ever became standard. In web design, the marquee and cursor animations both had their day in
the sun and then quickly disappeared under a heap of scorn. In smartphone development, this evolution
of idioms was controlled somewhat better, because of Apple's early position in the touch-enabled
smartphone market. Apple introduced elements of what has since become our touch lingua franca (see
Figure 6-1): the tap, the tap and hold, the swipe, the pinch. Nevertheless, in 2007, a great number of
articles appeared asking who would standardize our touch gestures for us as more and more vendors got
into the smartphone business.
Figure 6-1. Common touch gestures
There are several barriers to the conventionalization of interaction idioms. The first is that there is
sometimes much to be gained in avoiding standardization. We saw this in the browser wars of the late
'90s where, despite lip service being paid to the importance of conventionalization, browser makers
consistently created their own dialects of HTML in order to lock developers into their technology. Device
makers can likewise take advantage of market share to lock consumers into their gestures and make
gesture implementations on other phones seem non-intuitive simply because they are different and
seemingly unnatural—that is, they require relearning.
A second barrier to conventionalization is the patenting of contextual gestures. For instance, Apple
cannot patent the swipe, but it can and has patented the swipe for unlocking phones. This forces other
device manufacturers to pay Apple for the privilege of using the swipe to unlock devices, fight Apple in
the courts in order to make the convention free, or just not use that contextual gesture. Not using it,
however, breaks the convention we have been taught is the most natural way to unlock smartphones,
music players, and tablets.
A final barrier is that designing gestures is just difficult. Gestural idioms face the same problem that
phone apps in the App Store and videos on YouTube face: people either take to them or they don't.
Gestures that require overthinking to learn simply will not be adopted. This is the tyranny of the long
tail.
Whatmakesagoodgesturalidiom?Gesturesareconsideredgoodiftheyareusable.Ininteraction
design,twoconceptsessentialforusabilityareaffordanceandfeedback.Feedbackisanythingthatlets
theuserknowheisdoingsomething.Ontheweb,buttonslookoffsettoindicatethataninteractionhas
beensuccessful.Themouseprovidesafaintclicksoundforthesamereason.Itreassuresusthemouse
buttonisworking.IntheWindowsPhoneMetrostyle,tilestilt.Developersaretaughtthattheirbuttons
needtobelargeenoughtoallowforlargetouchareas,buttheyshouldalsobetaughttomakethemlarge
enoughtoregisterfeedbackevenwhentheuser’sfingeroccludesthetoucharea.Additionally,status
messagesorconfirmationwindowsmayalsopopupinanapplicationtomakeuscertainthatsomething
has happened. In the Xbox dashboard, hovering over a hotspot using the Kinect sensor causes the cursor to play an animation.

If feedback is what happens during and after an activity, affordance is what happens before an activity. An affordance is a cue that tells users that a visual element is interactive and, ideally, indicates to the user what the visual element is used for. In GUI interfaces, the best idiom to accomplish this has always been the button. The button communicates its function with text or iconography. For GUI buttons, a hover state may provide additional information about what the button is used for. The best affordance, and this is a bit circular, happens to be convention. A user knows what a visual element is for, because she has used similar visual elements in other applications and other devices to perform the same activity. This is difficult with a Kinect-based gestural interface, however, because everything is still so new.

The trick to getting around this is to use conventions from other kinds of interfaces. The tap gesture in touch interfaces is a direct correlate to clicking with a mouse. The two visual elements used to register taps, icons and buttons, are even designed exactly the same as icons and buttons on GUIs, to provide an extra clue as to how to use them. Kinect interfaces currently also use buttons and icons to make using them easier to learn. Because Kinect technology basically makes pointing easy but has no native support for clicking, much of the effort in this first year of the consumerization of gestural interfaces has been devoted to implementing the click.

Unlike touch interfaces, gestural interfaces have an additional reservoir of conventions to draw on from the world of human gestures. This is what makes the wave the quintessential gesture for Kinect. It has a symbolic connection to real-world movement that makes it easy to understand and use. Tracking, though not technically a gesture, is another idiom that has a real-world correlate in pointing. When I move my hand around in front of the television screen or monitor, a good Kinect interface provides a cursor that moves along with my hand. In Kinect hand tracking, the cursor is the feedback while pointing in the natural world is the affordance.

Currently, real-world affordances are little used in Kinect interfaces while GUI affordances are common. This will hopefully shift over time. On touch devices, new gestures take the form of adding additional fingers to the already established conventions. A two-finger tap signifies something different from a one-finger tap. Two-finger and even three-finger swipes are being given special meanings. Eventually, the touch gestural vocabulary will run out of fingers to work with. True gestural interfaces, on the other hand, have a near infinite vocabulary to work with if we begin to base them on their real-world correlates.

The remainder of the chapter will move from theory to practice, providing guidance for implementing the eight common Kinect gestures currently in circulation: the wave, the hover button, the magnet button, the push button, the magnetic slide, the universal pause, vertical scrolling, and swiping.
Implementing Gestures

The Microsoft Kinect SDK does not include a gesture detection engine. Therefore, it is left to developers to define and detect gestures. Since the release of the first beta version of the SDK, a few third-party efforts towards creating a gesture engine have surfaced. However, none has risen to become the standard or accepted tool. Things are likely to remain this way until Microsoft adds its own gesture detection engine to the SDK or makes it obvious that it is clearing the way for someone else to do so. This section serves as an introduction to gesture detection development in the hope of providing developers enough to be self-sufficient until a standard set of tools materializes.

Gesture detection can be relatively simple or intensely complex, depending on the gesture. There are three basic approaches to detecting gestures: algorithmic, neural network, and by example. Each of these techniques has its strengths and weaknesses. The methodology a developer chooses will depend on the gesture, the needs of the project, the time available, and development skill. The algorithmic
approach is the simplest and easiest to implement, whereas the neural network and exemplar systems are complex and non-trivial.
Algorithmic Detection

Algorithms are the basic approach to solving virtually all software development problems. Using algorithms is a simple process of defining rules and conditions that must be satisfied to produce a result. In the case of gesture detection, the result is binary: a gesture is either performed or not performed. Using algorithms to detect gestures is the most basic approach because it is easy to code, relatively simple for a developer of any skill level to interpret, write, and maintain, and straightforward to debug.

This direct approach, however, is also an encumbrance. The simplistic nature of algorithms can limit the types of gestures they can detect. An algorithmic technique is appropriate for detecting a wave but not for detecting a throw or a swing. The movements of the former are comparatively simple and uniform, whereas those of the latter are more nuanced and variable. While it is possible to write a swing detection algorithm, the code is likely to be both convoluted and fragile.

There is also an inherent scalability problem with algorithms. Although some code reuse is possible, each gesture must be detected using a bespoke algorithm. As new gesture routines are added to a library, the size of the library grows increasingly large. This creates additional problems in the performance of the detection routine, as a larger number of algorithms must execute to determine if a gesture has been performed.

Finally, each gesture algorithm requires different parameters, such as durations and thresholds. This will become more obvious as we explore the specific implementation of common gestures in the sections to follow. Developers must test and experiment to determine the appropriate parameters for each algorithm. This is itself a challenge and, if nothing else, a tedious undertaking. However, every gesture detection process has this particular problem.
Neural Networks

When a user performs a gesture, the form of the gesture is not always crisp enough to make a clear determination of the user's intent. Take, for instance, the jump gesture. A jump is when the user temporarily propels herself into the air, such that her feet lose contact with the floor. This definition, while accurate, is not adequate to detect the jump gesture.

At first blush, a jump seems easy enough to detect using an algorithm. First, however, consider the different forms of jumping: basic jumping, hurdling, long jumping, hopping, and so on. A bigger impediment is that it is not always possible, due to the limits imposed by Kinect's view area, to determine where the floor is, making it impossible to detect when the feet have left the floor. Imagine a jump where the user bends at the knees to the point of a squat and then propels himself into the air. Should the gesture detection engine evaluate this as a single gesture or multiple gestures: a squat or a duck-and-jump, or just a jump? If the user pauses longer in the squat pose and the force by which he propels himself upward is minimal, then this gesture should be evaluated as a squat and not a jump. The original definition of a jump has quickly disintegrated into ambiguity. The gesture is difficult to define clearly enough to write an algorithm without the algorithm becoming unmanageable and unstable due to the burdensome rules and conditions. The binary strategy of evaluating user movements algorithmically is too simplistic and not robust enough for gestures like jump, duck, and squat.

Neural networks organize and evaluate based on statistics and probabilities, and therefore make detecting gestures like these more manageable. A gesture detection engine based on a neural network might report an 80% chance that the user jumped and a 10% chance that he squatted.
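To make the contrast with the binary algorithmic result concrete, a probabilistic engine returns a score per candidate gesture instead of a yes/no answer. The following fragment is purely illustrative; the scores and the dictionary shape are hypothetical, not the API of any actual engine:

```csharp
// Hypothetical output of a probabilistic gesture classifier: a confidence
// score per candidate gesture rather than a single boolean.
var scores = new Dictionary<string, double>
{
    { "Jump",  0.80 },
    { "Squat", 0.10 },
    { "Duck",  0.05 }
};

// The application acts only when the most likely gesture clears a threshold.
KeyValuePair<string, double> best = scores.OrderByDescending(s => s.Value).First();

if (best.Value > 0.75)
{
    Console.WriteLine("Detected gesture: " + best.Key);  // "Detected gesture: Jump"
}
```

The thresholding step matters: when no candidate is sufficiently likely, the engine reports nothing rather than guessing.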
Beyond being able to detect complex and subtle differences between gestures, a neural network approach resolves the scalability issues of the algorithmic model. The network consists of nodes where
each node is a tiny algorithm that evaluates a small element of a gesture or movement. In a neural network, several gestures will share nodes, but never the exact same combination or sequence of nodes. Further, neural networks are efficient data structures for processing information. This makes them a highly performant means of detecting gestures.

The downside of this approach is that it is naturally complex. While neural networks and their application in computer science have been around for decades, building them is not a commonplace task for the vast majority of application developers. Most developers are likely to have used graphs and trees only in data structure courses in college, but nothing on the scale of a neural network or the implementation of fuzzy logic. Not having experience building these networks is a formidable impediment. Even for those developers skilled in building neural networks, this approach is difficult to debug.

As with the algorithmic approach, neural networks rely on a great number of parameters to produce accurate results. The number of parameters grows with each node. As each node can be used to detect multiple gestures, any change to the parameters of one node will affect the detection of gestures on other nodes. Configuring and tweaking these parameters is consequently more art than science. However, when neural networks are paired with machine learning processes that adjust the parameters automatically, the system can become highly accurate over time.
Detection by Example

An exemplar or template-based gesture recognition system compares the user's movements with known gesture forms. The user's movements are normalized and then used to calculate the probability of an accurate match. There are two forms of the exemplar approach. One method stores a collection of points, while the other uses a process similar to the Kinect SDK's skeleton tracking system. In the latter, the system contains a set of skeleton or depth frames, and statistical analysis matches the live frame with a known frame.

This approach to gesture detection is highly conducive to machine learning. The engine would record, process, and reuse live frames, and therefore become more accurate in its detection of gestures over time. The system would better understand how you specifically perform gestures. Such a system can more easily be taught new gestures and can handle complex gestures more successfully than any of the other approaches. However, this is not a trivial system to implement. First, the system relies on example data. The more data provided, the better the system works. As a result, the system is resource intensive, requiring storage for the data and CPU cycles to search for a match. Second, the system needs examples of users of different shapes, sizes, and dress (clothing affects the depth blob shape) performing several variations of the same gesture. For example, take the swinging of a baseball bat. There are as many variations of a bat swing as there are variations of throwing a baseball. A useful detection-by-example system must have example data not only of how different people perform a given gesture but also of the variety of ways a single person might perform that gesture.
Detecting Common Gestures

Choosing a gesture detection approach depends on the needs of your project. If the project uses only a few simple gestures, then either the algorithmic or the neural network approach is recommended. For all other types of projects, it may be in your best interest to invest the time to build a reusable gesture detection engine or to use one of the few available online. In the sections that follow, we describe several common gestures and demonstrate how to implement them using the algorithmic approach. The other two methods of gesture detection are beyond the scope of this book due to their complex and advanced nature.

Regardless of the system chosen to detect gestures, each must account for variations in the performance of a gesture. The system must be flexible and allow for multiple ranges of motion for the
same gesture. Rarely does a person perform the same gesture exactly the same way every time, much less the same as other users. For example, make a circular gesture with your left hand. Now repeat that gesture ten times. Was the radius of the circle the same each time? Did the circle start and end at exactly the same point in space? Did you complete the circle in the same duration each time? Perform the same experiment with your right hand and compare the results. Now grab friends and family and observe them making circles. Stand in front of a mirror and watch yourself gesture. Use a video recorder. The trick is to observe as many people as possible performing a gesture and attempt to normalize the movement. A good routine for gesture detection focuses on the core components of the gesture and ignores everything else as extraneous.
The Wave

Anyone who has played a Kinect game on the Xbox has performed the wave gesture. The wave is a simple motion that anyone can do regardless of age or size. It is a friendly and happy gesture. Try to wave and be unhappy. A person waves to say hello and good-bye. In the context of gesture application development, the wave tells the application that the user is ready to begin the experience.

The wave is a basic gesture with simple movements. This makes it easy to detect using an algorithmic approach; however, any detection methodology previously described also works. While the wave is an easy gesture to perform, how do you detect a wave using code? Start by standing in front of a mirror and waving at yourself (as this author has, admittedly, done repeatedly in the course of writing this section). Take note of the motion you make with your hand. Pay attention to the relationship between the hand and the arm during the gesture. Continue watching the hand and the arm, but now observe how the entire body tends to move while making the gesture. Think of the different ways other people wave that differ from your wave. Some people wave by keeping their body and arm still, oscillating the hand from side to side at the wrist. Others keep the body and arm still, but move the hand forward and backward at the wrist. There are several other forms of a hand wave. Research the wave gesture by observing the way others wave.

The wave gesture used on the Xbox starts with the arm extended and bent at the elbow. The user moves the forearm, with the elbow as a pivot point, back and forth along a plane that is roughly in line with the shoulders. The arm is parallel to the floor. At the midpoint of the wave gesture, the forearm is perpendicular to both the upper arm and the floor. Figure 6-2 illustrates this gesture. The first observation from these images is that the hand and wrist are above the elbow and the shoulder, which is consistent for most wave motions. This is our first testable criterion for a potential wave gesture.

Figure 6-2. A user waving
The first frame of Figure 6-2 shows the gesture in the neutral position. The forearm is perpendicular to the rest of the arm. If the hand breaks this position by moving either to the left or to the right, we consider this a segment of the gesture. For there to be a wave gesture, the hand must oscillate multiple times between segments, otherwise it is an incomplete gesture. This movement becomes our second observation: a wave occurs when the hand or wrist oscillates some specified number of times to the left or right of the neutral position. Using these two observations, we can build a set of rules to code an algorithm to detect a wave gesture.

The algorithm counts the number of times the hand breaks the neutral zone. The neutral zone is defined by an arbitrary threshold from the elbow point. The detection scheme also requires the user to perform the gesture within a specific duration, otherwise the gesture fails. The wave gesture detection algorithm defined here is designed to stand alone and not to be included within an overarching gesture detection system. It maintains its own state and provides notification of a completed gesture using an event. The wave detector monitors multiple users and both hands for the wave gesture. The gesture code evaluates with each new skeleton frame, and as such, must maintain its detection state.

The code starts with Listing 6-1, which details two enumerations and a struct used to track gesture state. The first enumeration, WavePosition, defines the different positions of the hand during the wave gesture. The gesture detector class uses the WaveGestureState enumeration to track the state of each user's hand. WaveGestureTracker is a structure used to hold data needed in the detection of the gesture. It has a Reset method used when the user's hand fails to meet the basic criteria of the wave gesture, such as when the hand is below the elbow.
Listing 6-1. Building a Foundation for the Wave Gesture

private enum WavePosition
{
    None    = 0,
    Left    = 1,
    Right   = 2,
    Neutral = 3
}

private enum WaveGestureState
{
    None       = 0,
    Success    = 1,
    Failure    = 2,
    InProgress = 3
}

private struct WaveGestureTracker
{
    #region Fields
    public int IterationCount;
    public WaveGestureState State;
    public long Timestamp;
    public WavePosition StartPosition;
    public WavePosition CurrentPosition;
    #endregion Fields

    #region Methods
    public void Reset()
    {
        IterationCount  = 0;
        State           = WaveGestureState.None;
        Timestamp       = 0;
        StartPosition   = WavePosition.None;
        CurrentPosition = WavePosition.None;
    }
    #endregion Methods
}
Listing 6-2 details the base of the wave gesture class. Three constants define the wave gesture: the neutral zone threshold, the gesture duration, and the number of movement iterations. These values should ideally be configurable parameters, but are shown as constants for simplicity. The WaveGestureTracker array holds the gesture tracking state for each possible user and hand. The class raises the GestureDetected event when a wave is detected.

The main application will call the Update method of the WaveGesture class each time a new frame is available. The code in this method iterates through each tracked skeleton in the frame and evaluates both the left and right hands of the user by calling the TrackWave method. Any skeleton not actively tracked has its gesture state reset.
Listing 6-2. A Wave Detection Class

public class WaveGesture
{
    #region Member Variables
    private const float WAVE_THRESHOLD      = 0.1f;
    private const int WAVE_MOVEMENT_TIMEOUT = 5000;
    private const int REQUIRED_ITERATIONS   = 4;

    // Indexes into the second dimension of _PlayerWaveTracker
    private const int LEFT_HAND  = 0;
    private const int RIGHT_HAND = 1;

    private WaveGestureTracker[,] _PlayerWaveTracker = new WaveGestureTracker[6, 2];

    public event EventHandler GestureDetected;
    #endregion Member Variables

    #region Methods
    public void Update(Skeleton[] skeletons, long frameTimestamp)
    {
        if(skeletons != null)
        {
            Skeleton skeleton;

            for(int i = 0; i < skeletons.Length; i++)
            {
                skeleton = skeletons[i];

                if(skeleton.TrackingState != SkeletonTrackingState.NotTracked)
                {
                    TrackWave(skeleton, true,
                              ref this._PlayerWaveTracker[i, LEFT_HAND], frameTimestamp);
                    TrackWave(skeleton, false,
                              ref this._PlayerWaveTracker[i, RIGHT_HAND], frameTimestamp);
                }
                else
                {
                    this._PlayerWaveTracker[i, LEFT_HAND].Reset();
                    this._PlayerWaveTracker[i, RIGHT_HAND].Reset();
                }
            }
        }
    }
    #endregion Methods
}
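To see how the detector plugs into an application, the following sketch subscribes to GestureDetected and feeds each skeleton frame to Update. The member names and the MessageBox feedback are illustrative, and the sensor is assumed to have been initialized with its skeleton stream enabled as in Chapter 4:

```csharp
// Illustrative wiring of WaveGesture into an application. _kinectSensor is
// assumed to be an initialized KinectSensor with SkeletonStream enabled.
private WaveGesture _waveGesture = new WaveGesture();

private void InitializeGestureDetection()
{
    _waveGesture.GestureDetected += (s, e) => MessageBox.Show("Wave detected!");
    _kinectSensor.SkeletonFrameReady += KinectDevice_SkeletonFrameReady;
}

private void KinectDevice_SkeletonFrameReady(object sender,
                                             SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];
            frame.CopySkeletonDataTo(skeletons);

            // The frame timestamp drives the WAVE_MOVEMENT_TIMEOUT check.
            _waveGesture.Update(skeletons, frame.Timestamp);
        }
    }
}
```

Because WaveGesture keeps its own per-user, per-hand state, the application only has to forward frames; it does not need to track gesture progress itself.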
The TrackWave method (Listing 6-3) does the real work of detecting the wave gesture. It performs the validation we previously defined to constitute a wave gesture and updates the gesture state. It is written to detect waves from either the left or right hand. The first validation determines if both the hand and the elbow points are actively tracked. The tracking state is reset if either of the two points is unavailable; otherwise the validation moves to the next phase.

If the user has not moved progressively to the next phase of the gesture before the defined duration passes, gesture tracking expires and the tracking data resets. The next validation determines if the hand is above the elbow. If not, then the gesture either fails or resets depending on the current tracking
state. If the hand is higher on the Y-axis than the elbow, the method determines the position of the hand in relation to the elbow on the X-axis. The UpdatePosition method is called with the appropriate hand position value. After updating the position of the hand, the final check is to see if the number of required iterations is satisfied. If so, then a wave gesture has been detected and the GestureDetected event is raised.
Listing 6-3. Tracking a Wave Gesture

private void TrackWave(Skeleton skeleton, bool isLeft,
                       ref WaveGestureTracker tracker, long timestamp)
{
    JointType handJointId  = (isLeft) ? JointType.HandLeft  : JointType.HandRight;
    JointType elbowJointId = (isLeft) ? JointType.ElbowLeft : JointType.ElbowRight;
    Joint hand             = skeleton.Joints[handJointId];
    Joint elbow            = skeleton.Joints[elbowJointId];

    if(hand.TrackingState != JointTrackingState.NotTracked &&
       elbow.TrackingState != JointTrackingState.NotTracked)
    {
        if(tracker.State == WaveGestureState.InProgress &&
           tracker.Timestamp + WAVE_MOVEMENT_TIMEOUT < timestamp)
        {
            tracker.UpdateState(WaveGestureState.Failure, timestamp);
        }
        else if(hand.Position.Y > elbow.Position.Y)
        {
            //Using the raw values where (0, 0) is the middle of the screen.
            //From the user's perspective, the X-axis grows more negative to the left
            //and more positive to the right.
            if(hand.Position.X <= elbow.Position.X - WAVE_THRESHOLD)
            {
                tracker.UpdatePosition(WavePosition.Left, timestamp);
            }
            else if(hand.Position.X >= elbow.Position.X + WAVE_THRESHOLD)
            {
                tracker.UpdatePosition(WavePosition.Right, timestamp);
            }
            else
            {
                tracker.UpdatePosition(WavePosition.Neutral, timestamp);
            }

            if(tracker.State != WaveGestureState.Success &&
               tracker.IterationCount == REQUIRED_ITERATIONS)
            {
                tracker.UpdateState(WaveGestureState.Success, timestamp);

                if(GestureDetected != null)
                {
                    GestureDetected(this, new EventArgs());
                }
            }
        }
        else
        {
            if(tracker.State == WaveGestureState.InProgress)
            {
                tracker.UpdateState(WaveGestureState.Failure, timestamp);
            }
            else
            {
                tracker.Reset();
            }
        }
    }
    else
    {
        tracker.Reset();
    }
}
Listing 6-4 details methods that should be added to the WaveGestureTracker structure. These helper methods assist in maintaining the structure's fields and make the code in the TrackWave method easier to read. The UpdatePosition method is the only method of note. Each time it is determined that the hand has changed position, this method is called by TrackWave. Its core purpose is to update the CurrentPosition and Timestamp properties. This method is also responsible for updating the IterationCount field and for changing the State to InProgress.
Listing 6-4. Helper Methods to Update Tracking State

public void UpdateState(WaveGestureState state, long timestamp)
{
    State     = state;
    Timestamp = timestamp;
}

public void Reset()
{
    IterationCount  = 0;
    State           = WaveGestureState.None;
    Timestamp       = 0;
    StartPosition   = WavePosition.None;
    CurrentPosition = WavePosition.None;
}

public void UpdatePosition(WavePosition position, long timestamp)
{
    if(CurrentPosition != position)
    {
        if(position == WavePosition.Left || position == WavePosition.Right)
        {
            if(State != WaveGestureState.InProgress)
            {
                State          = WaveGestureState.InProgress;
                IterationCount = 0;
                StartPosition  = position;
            }

            IterationCount++;
        }
    }

    CurrentPosition = position;
    Timestamp       = timestamp;
}
Basic Hand Tracking

Hand tracking is technically different from gesture detection. It is, however, the basis for many forms of gesture detection. Before going through the details of building individual gesture controls, we will build a set of reusable classes for simple tracking of hand motions. This hand tracking utility will also include a visual feedback mechanism in the form of an animated cursor. Our hand tracker will also interact with controls in a highly decoupled manner.

Begin by creating a new project based on the WPF Control Library project template. Add four classes to the project: KinectCursorEventArgs.cs, KinectInput.cs, CursorAdorner.cs, and KinectCursorManager.cs. These four classes will interact with each other to manage the cursor position
based on the relative location of the user's hand. The KinectInput class is a container for events that will be shared between the KinectCursorManager and several controls we will subsequently build. KinectCursorEventArgs provides a property bag for passing data to event handlers that listen for the KinectInput events. KinectCursorManager, as the name implies, manages the skeleton stream from the Kinect sensor, translates it into WPF coordinates, provides visual feedback about the translated screen position, and looks for controls on the screen to pass events to. Finally, CursorAdorner.cs will contain the visual element that provides a visual representation of the hand.

KinectCursorEventArgs inherits from the RoutedEventArgs class. It contains four properties: X, Y, Z, and Cursor. X, Y, and Z are numerical values representing the translated width, height, and depth coordinates of the user's hand. Cursor is a property holding an instance of the specialized class, CursorAdorner, which we will discuss later. Listing 6-5 shows the basic structure of the KinectCursorEventArgs class with some overloaded constructors.
Listing 6-5. Structure of the KinectCursorEventArgs Class
public class KinectCursorEventArgs: RoutedEventArgs
{
public KinectCursorEventArgs(double x, double y)
{
X = x;
Y = y;
}
public KinectCursorEventArgs(Point point)
{
X = point.X;
Y = point.Y;
}
public double X { get; set; }
public double Y { get; set; }
public double Z { get; set; }
public CursorAdorner Cursor { get; set; }
// . . .
}
The RoutedEventArgs base class also has a constructor that takes a RoutedEvent as a parameter. This somewhat unusual signature is used in special syntax for raising events from UIElements in WPF. As illustrated in Listing 6-6, the KinectCursorEventArgs class will implement this signature as well as several additional overloads that will be handy later.
Listing 6-6. Constructor Overloads for KinectCursorEventArgs
public KinectCursorEventArgs(RoutedEvent routedEvent): base(routedEvent){}
public KinectCursorEventArgs(RoutedEvent routedEvent, double x, double y, double z)
: base(routedEvent) { X = x; Y = y; Z = z; }
public KinectCursorEventArgs(RoutedEvent routedEvent, Point point)
: base(routedEvent) { X = point.X; Y = point.Y; }
public KinectCursorEventArgs(RoutedEvent routedEvent, Point point, double z)
: base(routedEvent) { X = point.X; Y = point.Y; Z = z; }
public KinectCursorEventArgs(RoutedEvent routedEvent, object source)
: base(routedEvent, source) {}
public KinectCursorEventArgs(RoutedEvent routedEvent, object source,
double x, double y, double z)
: base(routedEvent, source) { X = x; Y = y; Z = z; }
public KinectCursorEventArgs(RoutedEvent routedEvent, object source,
Point point)
: base(routedEvent, source) { X = point.X; Y = point.Y; }
public KinectCursorEventArgs(RoutedEvent routedEvent, object source,
Point point, double z)
: base(routedEvent, source) { X = point.X; Y = point.Y; Z = z; }
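These overloads come into play when the cursor manager raises routed events on a control under the hand cursor. As a brief illustrative sketch (assuming the KinectCursorEnterEvent declared in Listing 6-7 and the hit-testing helper from Listing 6-12; the variable names are hypothetical):

```csharp
// Raise the bubbling KinectCursorEnter event on whatever WPF element hit
// testing found under the translated hand coordinates (x, y, z).
UIElement element = GetElementAtScreenPoint(point, window);  // Listing 6-12

if (element != null)
{
    element.RaiseEvent(new KinectCursorEventArgs(
        KinectInput.KinectCursorEnterEvent, element, x, y, z));
}
```

Because the event bubbles, any ancestor of the hit element can also listen for it, which is what keeps the manager and the controls decoupled.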
Next, we will create events in the KinectInput class to pass messages from the KinectCursorManager to visual controls. These events will pass messages as KinectCursorEventArgs types. Open the KinectInput.cs file and add the delegate type KinectCursorEventHandler to the top of the class. Then add 1) a static routed event declaration, 2) an add method, and 3) a remove method for each of the KinectCursorEnter, KinectCursorLeave, KinectCursorMove, KinectCursorActivated, and KinectCursorDeactivated events. Listing 6-7 illustrates the code for the first three cursor-related events. Simply follow the same model in order to add the KinectCursorActivated and KinectCursorDeactivated routed events.
Listing 6-7. KinectInput Event Declarations
public delegate void KinectCursorEventHandler(object sender, KinectCursorEventArgs e);
public static class KinectInput
{
public static readonly RoutedEvent KinectCursorEnterEvent =
EventManager.RegisterRoutedEvent("KinectCursorEnter", RoutingStrategy.Bubble,
typeof(KinectCursorEventHandler), typeof(KinectInput));
public static void AddKinectCursorEnterHandler(DependencyObject o,
KinectCursorEventHandler handler)
{
((UIElement)o).AddHandler(KinectCursorEnterEvent, handler);
}
public static void RemoveKinectCursorEnterHandler(DependencyObject o,
KinectCursorEventHandler handler)
{
((UIElement)o).RemoveHandler(KinectCursorEnterEvent, handler);
}
public static readonly RoutedEvent KinectCursorLeaveEvent =
EventManager.RegisterRoutedEvent("KinectCursorLeave", RoutingStrategy.Bubble,
typeof(KinectCursorEventHandler), typeof(KinectInput));
public static void AddKinectCursorLeaveHandler(DependencyObject o,
KinectCursorEventHandler handler)
{
((UIElement)o).AddHandler(KinectCursorLeaveEvent, handler);
}
public static void RemoveKinectCursorLeaveHandler(DependencyObject o,
KinectCursorEventHandler handler)
{
((UIElement)o).RemoveHandler(KinectCursorLeaveEvent, handler);
}
// . . .
}
You will notice that there is neither hide nor hair of the Click event common to GUI programming in the code we have written so far. This is because there is no clicking in Kinect, and we want to make this clear in the design of our control library. Instead, the two first-class concepts in tracking with Kinect are enter and leave. The hand cursor may enter the area taken up by a control and then it can leave it. Clicking, when we want to use controls in a way that is analogous to GUI controls, must be simulated, since Kinect does not afford a native way to perform this movement.
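As a preview of how such a simulated click can work (the hover button later in this chapter is the full treatment), here is a rough sketch that treats a sustained hover as a click. The control class, the timer interval, and the OnClick method are illustrative names and values, not part of the SDK:

```csharp
// Illustrative sketch: a control that treats a two-second hover as a click.
// HoverControl and OnClick are hypothetical; the enter/leave handlers come
// from the KinectInput class in Listing 6-7.
public class HoverControl : ContentControl
{
    private readonly DispatcherTimer _hoverTimer = new DispatcherTimer
    {
        Interval = TimeSpan.FromSeconds(2)
    };

    public HoverControl()
    {
        _hoverTimer.Tick += (s, e) => { _hoverTimer.Stop(); OnClick(); };

        // Start timing when the hand cursor enters; cancel if it leaves early.
        KinectInput.AddKinectCursorEnterHandler(this, (s, e) => _hoverTimer.Start());
        KinectInput.AddKinectCursorLeaveHandler(this, (s, e) => _hoverTimer.Stop());
    }

    protected virtual void OnClick()
    {
        // Raise a click-like routed event or execute a command here.
    }
}
```

The enter/leave pair is all such a control needs; the feedback animation from the CursorAdorner class below tells the user the timer is running.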
The CursorAdorner class, which will hold the visual element that represents the user's hand, inherits from the WPF Adorner type. We do this because adorners have the peculiar characteristic of always being drawn on top of other elements, which is useful in our case because we do not want our cursor to be obscured by any other controls. As shown in Listing 6-8, our custom adorner will draw a default visual
element to represent the cursor but can also be passed a custom visual element. It will also bootstrap its own Canvas panel in which it will live.

Listing 6-8. The CursorAdorner Class
public class CursorAdorner : Adorner
{
private readonly UIElement _adorningElement;
private VisualCollection _visualChildren;
private Canvas _cursorCanvas;
protected FrameworkElement _cursor;
Storyboard _gradientStopAnimationStoryboard;
bool _isOverridden;  // set by the UpdateCursor overloads in Listing 6-10
// default cursor colors
readonly static Color _backColor = Colors.White;
readonly static Color _foreColor = Colors.Gray;
public CursorAdorner(FrameworkElement adorningElement)
: base(adorningElement)
{
this._adorningElement = adorningElement;
CreateCursorAdorner();
this.IsHitTestVisible = false;
}
public CursorAdorner(FrameworkElement adorningElement, FrameworkElement innerCursor)
: base(adorningElement)
{
this._adorningElement = adorningElement;
CreateCursorAdorner(innerCursor);
this.IsHitTestVisible = false;
}
public FrameworkElement CursorVisual
{
get
{
return _cursor;
}
}
public void CreateCursorAdorner()
{
var innerCursor = CreateCursor();
CreateCursorAdorner(innerCursor);
}
protected FrameworkElement CreateCursor()
{
    var brush = new LinearGradientBrush();
    brush.StartPoint = new Point(0, 0);
    brush.EndPoint = new Point(0, 1);
    brush.GradientStops.Add(new GradientStop(_backColor, 1));
    brush.GradientStops.Add(new GradientStop(_foreColor, 1));

    var cursor = new Ellipse()
    {
        Width = 50,
        Height = 50,
        Fill = brush
    };

    return cursor;
}
public void CreateCursorAdorner(FrameworkElement innerCursor)
{
_visualChildren = new VisualCollection(this);
_cursorCanvas = new Canvas();
_cursor = innerCursor;
_cursorCanvas.Children.Add(_cursor);
_visualChildren.Add(this._cursorCanvas);
AdornerLayer layer = AdornerLayer.GetAdornerLayer(_adorningElement);
layer.Add(this);
    }

    // . . .
}
Because we are inheriting from the Adorner base class, we must also override certain methods of the base class. Listing 6-9 demonstrates how the base class methods are tied to the _visualChildren and _cursorCanvas fields we instantiate in the CreateCursorAdorner method above.
Listing 6-9. Adorner Base Class Method Overrides
protected override int VisualChildrenCount
{
get
{
return _visualChildren.Count;
}
}
protected override Visual GetVisualChild(int index)
{
return _visualChildren[index];
}
protected override Size MeasureOverride(Size constraint)
{
this._cursorCanvas.Measure(constraint);
return this._cursorCanvas.DesiredSize;
}
protected override Size ArrangeOverride(Size finalSize)
{
this._cursorCanvas.Arrange(new Rect(finalSize));
return finalSize;
}
The cursor adorner is also responsible for finding its correct location. The basic UpdateCursor method shown in Listing 6-10 takes an X and Y coordinate position. It then offsets this position to ensure that the center of the cursor image is located over these X and Y coordinates rather than at the corner of the image. Additionally, we provide an overload of the UpdateCursor method that tells the cursor adorner that special coordinates will be passed to the adorner and that all normal calls to the UpdateCursor method should be ignored. This will be useful later when we want to ignore basic tracking in the magnet button control in order to provide a better gestural experience for the user.
Listing 6-10. Passing Coordinate Positions to the Cursor Adorner

public void UpdateCursor(Point position, bool isOverride)
{
    _isOverridden = isOverride;
    _cursor.SetValue(Canvas.LeftProperty, position.X - (_cursor.ActualWidth / 2));
    _cursor.SetValue(Canvas.TopProperty, position.Y - (_cursor.ActualHeight / 2));
}

public void UpdateCursor(Point position)
{
    if (_isOverridden)
        return;

    _cursor.SetValue(Canvas.LeftProperty, position.X - (_cursor.ActualWidth / 2));
    _cursor.SetValue(Canvas.TopProperty, position.Y - (_cursor.ActualHeight / 2));
}
Finally, we will add methods to animate the cursor visual element. With Kinect controls that require hovering over an element, it is useful to provide feedback informing the user that something is happening while she waits. Listing 6-11 shows the code for programmatically animating our default cursor framework element.
Listing 6-11. Cursor Animations
public virtual void AnimateCursor(double milliSeconds)
{
CreateGradientStopAnimation(milliSeconds);
if (_gradientStopAnimationStoryboard != null)
_gradientStopAnimationStoryboard.Begin(this, true);
}
public virtual void StopCursorAnimation()
{
if(_gradientStopAnimationStoryboard != null)
_gradientStopAnimationStoryboard.Stop(this);
}
protected virtual void CreateGradientStopAnimation(double milliSeconds)
{
NameScope.SetNameScope(this, new NameScope());
var cursor = _cursor as Shape;
if (cursor == null) return;
var brush = cursor.Fill as LinearGradientBrush;
var stop1 = brush.GradientStops[0];
var stop2 = brush.GradientStops[1];
this.RegisterName("GradientStop1", stop1);
this.RegisterName("GradientStop2", stop2);
DoubleAnimation offsetAnimation = new DoubleAnimation();
offsetAnimation.From = 1.0;
offsetAnimation.To = 0.0;
offsetAnimation.Duration = TimeSpan.FromMilliseconds(milliSeconds);
Storyboard.SetTargetName(offsetAnimation, "GradientStop1");
Storyboard.SetTargetProperty(offsetAnimation,
new PropertyPath(GradientStop.OffsetProperty));
DoubleAnimation offsetAnimation2 = new DoubleAnimation();
offsetAnimation2.From = 1.0;
offsetAnimation2.To = 0.0;
offsetAnimation2.Duration = TimeSpan.FromMilliseconds(milliSeconds);
Storyboard.SetTargetName(offsetAnimation2, "GradientStop2");
Storyboard.SetTargetProperty(offsetAnimation2,
new PropertyPath(GradientStop.OffsetProperty));
_gradientStopAnimationStoryboard = new Storyboard();
_gradientStopAnimationStoryboard.Children.Add(offsetAnimation);
_gradientStopAnimationStoryboard.Children.Add(offsetAnimation2);
_gradientStopAnimationStoryboard.Completed +=
delegate { _gradientStopAnimationStoryboard.Stop(this); };
}
In order to implement the KinectCursorManager class, we need several helper methods, as shown in Listing 6-12. The GetElementAtScreenPoint method tells us which WPF element is located directly under the X and Y coordinates passed to it. In this highly decoupled architecture, the GetElementAtScreenPoint method is our main engine for passing messages from the KinectCursorManager to custom controls that are receptive to these events. Additionally, we use two methods to determine which skeleton and which hand we want to track.

Listing 6-12. KinectCursorManager Helper Methods
private static UIElement GetElementAtScreenPoint(Point point, Window window)
{
if (!window.IsVisible)
return null;
Point windowPoint = window.PointFromScreen(point);
IInputElement element = window.InputHitTest(windowPoint);
if (element is UIElement)
return (UIElement)element;
else
return null;
}
private static Skeleton GetPrimarySkeleton(IEnumerable<Skeleton> skeletons)
{
Skeleton primarySkeleton = null;
foreach (Skeleton skeleton in skeletons)
{
if (skeleton.TrackingState != SkeletonTrackingState.Tracked)
{
continue;
}
if (primarySkeleton == null)
primarySkeleton = skeleton;
else if (primarySkeleton.Position.Z > skeleton.Position.Z)
primarySkeleton = skeleton;
    }

    return primarySkeleton;
}
private static Joint? GetPrimaryHand(Skeleton skeleton)
{
Joint leftHand = skeleton.Joints[JointType.HandLeft];
Joint rightHand = skeleton.Joints[JointType.HandRight];
if (rightHand.TrackingState == JointTrackingState.Tracked)
{
if (leftHand.TrackingState != JointTrackingState.Tracked)
return rightHand;
else if (leftHand.Position.Z > rightHand.Position.Z)
return rightHand;
else
return leftHand;
}
if (leftHand.TrackingState == JointTrackingState.Tracked)
return leftHand;
else
return null;
}
The KinectCursorManager itself is a singleton class. It is designed this way in order to make instantiating it less complicated. Any control that works with the KinectCursorManager can independently instantiate it if it has not already been instantiated. This means that any developer using one of these controls does not need to know anything about the KinectCursorManager itself. Instead, developers can simply drop one of these controls into their application and the control will take care of instantiating the KinectCursorManager. To make this self-serve type of control work with the KinectCursorManager class, we have to create several overloaded Create methods in order to pass in the main Window class of the application. Listing 6-13 illustrates the overloaded constructors as well as our particular singleton implementation.

Listing 6-13. KinectCursorManager Constructors
public class KinectCursorManager
{
    private KinectSensor _kinectSensor;
    private CursorAdorner _cursorAdorner;
    private readonly Window _window;
    private UIElement _lastElementOver;
    private bool _isSkeletonTrackingActivated;
    private static bool _isInitialized;
    private static KinectCursorManager _instance;
public static void Create(Window window)
{
if (!_isInitialized)
{
_instance = new KinectCursorManager(window);
_isInitialized = true;
}
}
public static void Create(Window window, FrameworkElement cursor)
{
if (!_isInitialized)
{
_instance = new KinectCursorManager(window, cursor);
_isInitialized = true;
}
}
public static void Create(Window window, KinectSensor sensor)
{
    if (!_isInitialized)
    {
        _instance = new KinectCursorManager(window, sensor);
        _isInitialized = true;
    }
}
public static void Create(Window window, KinectSensor sensor, FrameworkElement cursor)
{
if (!_isInitialized)
{
_instance = new KinectCursorManager(window, sensor, cursor);
_isInitialized = true;
}
}
public static KinectCursorManager Instance
{
get { return _instance; }
}
private KinectCursorManager(Window window)
: this(window, KinectSensor.KinectSensors[0])
{
}
private KinectCursorManager(Window window, FrameworkElement cursor)
: this(window, KinectSensor.KinectSensors[0], cursor)
{
}
private KinectCursorManager(Window window, KinectSensor sensor)
: this(window, sensor, null)
{
}
private KinectCursorManager(Window window, KinectSensor sensor, FrameworkElement cursor)
{
    this._window = window;

    // ensure a Kinect sensor is present
    if (KinectSensor.KinectSensors.Count > 0)
    {
        _window.Unloaded += delegate
        {
            if (this._kinectSensor.SkeletonStream.IsEnabled)
                this._kinectSensor.SkeletonStream.Disable();
            _kinectSensor.Stop();
        };
        _window.Loaded += delegate
        {
            if (cursor == null)
                _cursorAdorner = new CursorAdorner((FrameworkElement)window.Content);
            else
                _cursorAdorner =
                    new CursorAdorner((FrameworkElement)window.Content, cursor);
            this._kinectSensor = sensor;
            this._kinectSensor.SkeletonFrameReady += SkeletonFrameReady;
            this._kinectSensor.SkeletonStream.Enable(new TransformSmoothParameters());
            this._kinectSensor.Start();
        };
    }
}
// . . .
Listing 6-14 shows how the KinectCursorManager interacts with visual elements in the Window object. As the user's hand passes over various elements in the application, the cursor manager constantly keeps track of the current element under the user's primary hand as well as the previous element under it. When this changes, the manager throws the leave event on the previous control and the enter event on the current one. We also keep track of the KinectSensor object and throw the activated and deactivated events as appropriate.

Listing 6-14. KinectCursorManager Event Management
private void SetSkeletonTrackingActivated()
{
    if (_lastElementOver != null && _isSkeletonTrackingActivated == false)
    {
        _lastElementOver.RaiseEvent(
            new RoutedEventArgs(KinectInput.KinectCursorActivatedEvent));
    }
    _isSkeletonTrackingActivated = true;
}

private void SetSkeletonTrackingDeactivated()
{
    if (_lastElementOver != null && _isSkeletonTrackingActivated == true)
    {
        _lastElementOver.RaiseEvent(
            new RoutedEventArgs(KinectInput.KinectCursorDeactivatedEvent));
    }
    _isSkeletonTrackingActivated = false;
}
private void HandleCursorEvents(Point point, double z)
{
    UIElement element = GetElementAtScreenPoint(point, _window);
    if (element != null)
    {
        element.RaiseEvent(
            new KinectCursorEventArgs(KinectInput.KinectCursorMoveEvent, point, z)
            { Cursor = _cursorAdorner });

        if (element != _lastElementOver)
        {
            if (_lastElementOver != null)
            {
                _lastElementOver.RaiseEvent(
                    new KinectCursorEventArgs(KinectInput.KinectCursorLeaveEvent, point, z)
                    { Cursor = _cursorAdorner });
            }
            element.RaiseEvent(
                new KinectCursorEventArgs(KinectInput.KinectCursorEnterEvent, point, z)
                { Cursor = _cursorAdorner });
        }
    }
    _lastElementOver = element;
}
Finally, we can write the two methods at the heart of the KinectCursorManager class. The SkeletonFrameReady method is the standard event handler for skeleton frames from the Kinect. In this project, the SkeletonFrameReady method takes care of grabbing the appropriate skeleton and then the appropriate hand. It then passes the hand joint it finds to the UpdateCursor method. UpdateCursor performs the difficult task of translating Kinect skeleton coordinates into coordinate values understood by WPF. The MapSkeletonPointToDepth method provided by the Kinect SDK performs part of this translation. The X and Y values it returns are then scaled to the actual width and height of the application window. The Z position is treated differently: it is simply passed along as millimeters from the Kinect depth camera. As shown in Listing 6-15, once these coordinates are identified, they are passed to the HandleCursorEvents method and then to the cursor adorner itself in order to provide accurate feedback to the user.
Listing 6-15. Translating Kinect Data into WPF Data
private void SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using (SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if (frame == null || frame.SkeletonArrayLength == 0)
            return;

        Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];
        frame.CopySkeletonDataTo(skeletons);

        Skeleton skeleton = GetPrimarySkeleton(skeletons);
        if (skeleton == null)
        {
            SetSkeletonTrackingDeactivated();
        }
        else
        {
            Joint? primaryHand = GetPrimaryHand(skeleton);
            if (primaryHand.HasValue)
            {
                UpdateCursor(primaryHand.Value);
            }
            else
            {
                SetSkeletonTrackingDeactivated();
            }
        }
    }
}
private void UpdateCursor(Joint hand)
{
var point = _kinectSensor.MapSkeletonPointToDepth(hand.Position,
_kinectSensor.DepthStream.Format);
float x = point.X;
float y = point.Y;
float z = point.Depth;
x = (float)(x * _window.ActualWidth / _kinectSensor.DepthStream.FrameWidth);
y = (float)(y * _window.ActualHeight / _kinectSensor.DepthStream.FrameHeight);
Point cursorPoint = new Point(x, y);
HandleCursorEvents(cursorPoint, z);
_cursorAdorner.UpdateCursor(cursorPoint);
}
So far, we have simply written a lot of infrastructure that does little more than move a cursor around the screen based on a user's hand movements. We will now build a base class that listens for events from the cursor. Create a new class called KinectButton and inherit from the WPF Button type. We will take three of the events we previously created in the KinectInput class and recreate them in our KinectButton. As you see in Listing 6-16, we will also create add and remove methods for these events.

Listing 6-16. The KinectButton Base Class
public class KinectButton: Button
{
public static readonly RoutedEvent KinectCursorEnterEvent =
KinectInput.KinectCursorEnterEvent.AddOwner(typeof(KinectButton));
public static readonly RoutedEvent KinectCursorLeaveEvent =
KinectInput.KinectCursorLeaveEvent.AddOwner(typeof(KinectButton));
public static readonly RoutedEvent KinectCursorMoveEvent =
KinectInput.KinectCursorMoveEvent.AddOwner(typeof(KinectButton));
public event KinectCursorEventHandler KinectCursorEnter
{
add { base.AddHandler(KinectCursorEnterEvent, value); }
remove { base.RemoveHandler(KinectCursorEnterEvent, value); }
}
public event KinectCursorEventHandler KinectCursorLeave
{
add { base.AddHandler(KinectCursorLeaveEvent, value); }
remove { base.RemoveHandler(KinectCursorLeaveEvent, value); }
}
public event KinectCursorEventHandler KinectCursorMove
{
add { base.AddHandler(KinectCursorMoveEvent, value); }
remove { base.RemoveHandler(KinectCursorMoveEvent, value); }
}
}
// . . .
In the constructor for the KinectButton, we check to see whether the control is running in a designer or in an actual application. If it is not running in a designer, we have the KinectCursorManager instantiate itself if it has not already done so. In this way, as explained above, we can have multiple Kinect buttons in the same window and they will work out among themselves which one creates the KinectCursorManager instance, without ever bothering the developer about it. Listing 6-17 demonstrates how this occurs as well as how the events from Listing 6-16 are connected to base event handling methods. The HandleCursorEvents method in the KinectCursorManager takes care of calling these events.
Listing 6-17. KinectButton Base Implementation
public KinectButton()
{
if (!System.ComponentModel.DesignerProperties.GetIsInDesignMode(this))
KinectCursorManager.Create(Application.Current.MainWindow);
this.KinectCursorEnter += new KinectCursorEventHandler(OnKinectCursorEnter);
this.KinectCursorLeave += new KinectCursorEventHandler(OnKinectCursorLeave);
this.KinectCursorMove += new KinectCursorEventHandler(OnKinectCursorMove);
}
protected virtual void OnKinectCursorLeave(object sender, KinectCursorEventArgs e)
{}
protected virtual void OnKinectCursorMove(object sender, KinectCursorEventArgs e)
{}
The next piece of code, shown in Listing 6-18, makes the KinectButton usable. We will hook up the KinectCursorEnter event to a standard click event. The initial interaction idioms for Kinect applications continue to draw on GUI metaphors. This can be easier for users to understand. Just as important, however, it can be easier for developers to understand. As developers, we have had over a decade of experience with laying out user interfaces using buttons. While the ultimate goal is to move away from these types of controls and towards an interface that uses pure gestures, buttons are still extremely useful for now as we wrap our heads around new forms of natural user interfaces. Additionally, it makes it easy to take a standard interface built for graphical user interfaces and simply replace buttons with Kinect buttons.

Listing 6-18. Adding a Click to the KinectButton
protected virtual void OnKinectCursorEnter(object sender, KinectCursorEventArgs e)
{
RaiseEvent(new RoutedEventArgs(ClickEvent));
}
The great problem with this sort of control, and the reason you will not see it in very many Kinect applications, is that it is not able to distinguish between intentional and accidental hits. It has the same liabilities that a traditional mouse-based GUI application would have if every pass of the cursor over a button activated the button even without a mouse click. Such an interface is simply unusable and highlights the underlying difficulty of migrating idioms from the graphical user interface to other mediums. The hover button was Microsoft's first attempt to solve this particular problem.
HoverButton
The hover button was introduced in 2010 with the Xbox dashboard revamp for Kinect. The hover button solves the problem of accidental hit detection by replacing a mouse click with a hover-and-wait action. When the cursor passes over a button, the user indicates that he wants to select the button by waiting for a couple of seconds with the cursor hovering over it. An additional key feature of the hover button is that it provides visual feedback during this wait by animating the cursor in some fashion.

The technique for implementing a hover button for Kinect is similar to the one developers had to use to implement the tap-and-hold gesture on Windows Phone when it was initially released with very little gesture support. A timer must be created to track how long a user has paused over the button. The timer starts running once the user's hand crosses the button's borders. If the user's hand leaves the button before the timer has finished, the timer is stopped. If the timer finishes before the user's hand leaves the button, a click event is thrown.
Create a new class in your control library called HoverButton. HoverButton will inherit from the KinectButton class we have already created. Add a field named _hoverTimer to hold the DispatcherTimer instance, as shown in Listing 6-19. Additionally, create a protected Boolean field named _timerEnabled and set it to true. We will not be using this field immediately, but it will be very important in later sections of this chapter when we want to use certain features of our HoverButton but need to deactivate the DispatcherTimer itself. Finally, we will create a HoverInterval dependency property that allows developers to define the duration of the hover in either code or XAML. This will default to two seconds, which appears to be the standard hover duration for most Xbox titles.

Listing 6-19. The Basic HoverButton Implementation
public class HoverButton : KinectButton
{
readonly DispatcherTimer _hoverTimer = new DispatcherTimer();
protected bool _timerEnabled = true;
public double HoverInterval
{
get { return (double)GetValue(HoverIntervalProperty); }
set { SetValue(HoverIntervalProperty, value); }
}
public static readonly DependencyProperty HoverIntervalProperty =
DependencyProperty.Register("HoverInterval", typeof(double)
, typeof(HoverButton), new UIPropertyMetadata(2000d));
// . . .
}
To implement the heart of the hover button's functionality, we override the OnKinectCursorLeave and OnKinectCursorEnter methods of our base class, as shown in Listing 6-20. All the necessary interaction with the KinectCursorManager has already been taken care of in the KinectButton class, so we do not have to worry about it. In the constructor, just initialize the DispatcherTimer with the HoverInterval dependency property and attach an event handler called _hoverTimer_Tick to the timer's Tick event. Tick is the event thrown by the timer when the interval duration has run its course. The event handler simply throws a standard Click event. In the OnKinectCursorEnter method, start the timer. In the OnKinectCursorLeave method, stop it. Additionally, and this is important, start and stop the cursor animation in the enter and leave methods. The animation itself is shelled out to the CursorAdorner object.
Listing 6-20. The Heart of the HoverButton
public HoverButton()
{
_hoverTimer.Interval = TimeSpan.FromMilliseconds(HoverInterval);
_hoverTimer.Tick += _hoverTimer_Tick;
_hoverTimer.Stop();
}
void _hoverTimer_Tick(object sender, EventArgs e)
{
_hoverTimer.Stop();
RaiseEvent(new RoutedEventArgs(ClickEvent));
}
override protected void OnKinectCursorLeave(object sender, KinectCursorEventArgs e)
{
if (_timerEnabled)
{
e.Cursor.StopCursorAnimation();
_hoverTimer.Stop();
}
}
override protected void OnKinectCursorEnter(object sender, KinectCursorEventArgs e)
{
if (_timerEnabled)
{
_hoverTimer.Interval = TimeSpan.FromMilliseconds(HoverInterval);
e.Cursor.AnimateCursor(HoverInterval);
_hoverTimer.Start();
}
}
The hover button quickly became ubiquitous in Kinect applications for the Xbox. One problem eventually discovered with it, however, was that the cursor hand had a tendency to become slightly jittery when it paused over a button. This may be an artifact of the Kinect skeleton recognition software itself. Kinect is very good at smoothing out skeletons when they are in motion, because it uses a variety of predictive and smoothing techniques to even out quick motions. Poses, however, seem to give it problems. Additionally, and more to the point, people are simply not good at keeping their hands motionless even when they think they are doing so. Kinect picks up on these slight movements and mirrors them back to the user as feedback. A jittery hand is disconcerting when the intent of the user is to do absolutely nothing, and this can undermine the feedback provided by the cursor animation. An improvement on the hover button, called the magnet button, eventually replaced the hover button in subsequent game updates and made its way to the Xbox dashboard as well. We will discuss how to implement the magnet button later.
PushButton
Even as the hover button and its variants became common on the Xbox, hackers building applications for the PC created an alternative interaction idiom called the push button. The push button attempts to translate the traditional GUI button to Kinect in a much more literal fashion than the hover button does. To replace the click of a mouse, the push button uses a forward pressing of the hand into the air in front of the user.

While this movement of the hand, often with palm open and facing forward, is symbolically similar to our kinetic experience of the mouse, it is also a bit disconcerting. It feels like a failed attempt at a high five, and I always have the sense that I have been left hanging after performing this maneuver. Furthermore, it leaves the user slightly off balance even when he has performed the gesture correctly. Needless to say, I am not a fan of the gesture.

Here is how you implement the push button. The core algorithm of the push button detects a negative movement of the hand along the Z-axis. Additionally, the movement should exceed a certain distance threshold in order to register. As Listing 6-21 illustrates, our push button has a dependency property called PushThreshold, measured in millimeters, that allows the developer to determine how sensitive the push button will be. When the hand cursor passes over the push button, we take a snapshot of the Z position of the hand. We subsequently compare the current hand depth against that snapshot and, when the difference exceeds the threshold, a click event is thrown.
Listing 6-21. A Simple PushButton
public class PushButton: KinectButton
{
protected double _handDepth;
public double PushThreshold
{
get { return (double)GetValue(PushThresholdProperty); }
set { SetValue(PushThresholdProperty, value); }
}
public static readonly DependencyProperty PushThresholdProperty =
DependencyProperty.Register("PushThreshold", typeof(double),
typeof(PushButton), new UIPropertyMetadata(100d));
protected override void OnKinectCursorMove(object sender, KinectCursorEventArgs e)
{
if (e.Z < _handDepth - PushThreshold)
{
RaiseEvent(new RoutedEventArgs(ClickEvent));
}
}
protected override void OnKinectCursorEnter(object sender, KinectCursorEventArgs e)
{
_handDepth = e.Z;
}
}
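The comparison in Listing 6-21 reduces to a single predicate. Isolated as a plain function (a sketch with hypothetical names, not part of the chapter's control code), it can be sanity-checked without any Kinect hardware:

```csharp
using System;

public static class PushDetection
{
    // A push registers when the hand's current depth is closer to the
    // sensor than the depth recorded on enter, by more than the threshold.
    // Depths are in millimeters, matching the PushThreshold property.
    public static bool IsPush(double currentZ, double enterZ, double pushThreshold)
    {
        return currentZ < enterZ - pushThreshold;
    }
}
```

With a 100 mm threshold, a hand that entered at 900 mm must move closer than 800 mm before a click is raised.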
MagnetButton
The magnet button is the improved hover button discussed previously. Its role is simply to subtly improve the user experience when hovering over a button. It intercepts the tracking position of the hand and automatically snaps the cursor to the center of the magnet button. When the user's hand leaves the area of the magnet button, the cursor is allowed to track normally again. In all other ways, the magnet button behaves exactly like a hover button. Given that the functional delta between the magnet button and the hover button is so small, it may seem strange that we are treating it as a completely different control. In UX terms, however, it is an entirely different beast. From a coding perspective, as you will see, it is also more complex by an order of magnitude.

Begin by creating a new class called MagnetButton that inherits from HoverButton. The magnet button requires some additional events and properties to govern the period between when the hand cursor enters the area of the magnet button and when the hand has actually snapped into place. We need to add these new lock and unlock events to the KinectInput class, as shown in Listing 6-22.
Listing 6-22. Adding Lock and Unlock Events to KinectInput
public static readonly RoutedEvent KinectCursorLockEvent =
EventManager.RegisterRoutedEvent("KinectCursorLock", RoutingStrategy.Bubble,
typeof(KinectCursorEventHandler), typeof(KinectInput));
public static void AddKinectCursorLockHandler(DependencyObject o, KinectCursorEventHandler handler)
{
((UIElement)o).AddHandler(KinectCursorLockEvent, handler);
}
public static readonly RoutedEvent KinectCursorUnlockEvent =
EventManager.RegisterRoutedEvent("KinectCursorUnlock", RoutingStrategy.Bubble,
typeof(KinectCursorEventHandler), typeof(KinectInput));
public static void AddKinectCursorUnlockHandler(DependencyObject o, KinectCursorEventHandler handler)
{
((UIElement)o).AddHandler(KinectCursorUnlockEvent, handler);
}
These events can now be added to the MagnetButton class, as demonstrated in Listing 6-23. Additionally, we add two dependency properties governing how long the locking and unlocking animations take.
Listing 6-23. MagnetButton Events and Properties
public class MagnetButton : HoverButton
{
    public static readonly RoutedEvent KinectCursorLockEvent =
        KinectInput.KinectCursorLockEvent.AddOwner(typeof(MagnetButton));
    public static readonly RoutedEvent KinectCursorUnlockEvent =
        KinectInput.KinectCursorUnlockEvent.AddOwner(typeof(MagnetButton));
    // Storyboard used by the lock/unlock animations in Listing 6-25
    private Storyboard _move;
public event KinectCursorEventHandler KinectCursorLock
{
add { base.AddHandler(KinectCursorLockEvent, value); }
remove { base.RemoveHandler(KinectCursorLockEvent, value); }
}
public event KinectCursorEventHandler KinectCursorUnlock
{
add { base.AddHandler(KinectCursorUnlockEvent, value); }
remove { base.RemoveHandler(KinectCursorUnlockEvent, value); }
}
public double LockInterval
{
get { return (double)GetValue(LockIntervalProperty); }
set { SetValue(LockIntervalProperty, value); }
}
public static readonly DependencyProperty LockIntervalProperty =
DependencyProperty.Register("LockInterval", typeof(double)
, typeof(MagnetButton), new UIPropertyMetadata(200d));
public double UnlockInterval
{
get { return (double)GetValue(UnlockIntervalProperty); }
set { SetValue(UnlockIntervalProperty, value); }
}
public static readonly DependencyProperty UnlockIntervalProperty =
DependencyProperty.Register("UnlockInterval", typeof(double)
, typeof(MagnetButton), new UIPropertyMetadata(80d));
}
// . . .
At the heart of the magnet button is the code that moves the cursor from its current position to its intended position at the center of the button. This is actually a bit hairier than it might seem at first blush. Override the OnKinectCursorEnter and OnKinectCursorLeave methods of the base class. The first step in determining the lock position for the magnet button is to find the current position of the button itself, as shown in Listing 6-24. We do this with an extremely common WPF helper method called FindAncestor that recursively crawls the visual tree. The goal is to find the Window object hosting the magnet button. We then transform the magnet button's position into the window's coordinate space and assign the result to a variable called point. However, point only contains the location of the upper-left corner of the magnet button. We need to offset it by half the control's width and half the control's height in order to find the center of the button. This provides us with two values: x and y.
Listing 6-24. Heart of the MagnetButton
private T FindAncestor<T>(DependencyObject dependencyObject)
where T : class
{
DependencyObject target = dependencyObject;
do
{
target = VisualTreeHelper.GetParent(target);
}
while (target != null && !(target is T));
return target as T;
}
protected override void OnKinectCursorEnter(object sender, KinectCursorEventArgs e)
{
    // get the button position
    var rootVisual = FindAncestor<Window>(this);
    var point = this.TransformToAncestor(rootVisual)
                    .Transform(new Point(0, 0));
    var x = point.X + this.ActualWidth / 2;
    var y = point.Y + this.ActualHeight / 2;

    var cursor = e.Cursor;
    cursor.UpdateCursor(new Point(e.X, e.Y), true);

    // find the target position
    Point lockPoint = new Point(x - cursor.CursorVisual.ActualWidth / 2
        , y - cursor.CursorVisual.ActualHeight / 2);
    // find the current location
    Point cursorPoint = new Point(e.X - cursor.CursorVisual.ActualWidth / 2
        , e.Y - cursor.CursorVisual.ActualHeight / 2);

    // guide the cursor to its final position
    AnimateCursorToLockPosition(e, x, y, cursor, ref lockPoint, ref cursorPoint);

    base.OnKinectCursorEnter(sender, e);
}
protected override void OnKinectCursorLeave(object sender, KinectCursorEventArgs e)
{
    base.OnKinectCursorLeave(sender, e);
    e.Cursor.UpdateCursor(new Point(e.X, e.Y), false);

    // get the button position
    var rootVisual = FindAncestor<Window>(this);
    var point = this.TransformToAncestor(rootVisual)
                    .Transform(new Point(0, 0));
    var x = point.X + this.ActualWidth / 2;
    var y = point.Y + this.ActualHeight / 2;

    var cursor = e.Cursor;

    // find the target position
    Point lockPoint = new Point(x - cursor.CursorVisual.ActualWidth / 2
        , y - cursor.CursorVisual.ActualHeight / 2);
    // find the current location
    Point cursorPoint = new Point(e.X - cursor.CursorVisual.ActualWidth / 2
        , e.Y - cursor.CursorVisual.ActualHeight / 2);

    // guide the cursor away from its locked position
    AnimateCursorAwayFromLockPosition(e, cursor, ref lockPoint, ref cursorPoint);
}
Next, we update the cursor adorner with the current true X and Y positions of the user's hand. We also pass a second parameter, which tells the cursor adorner that it should stop automatically tracking the user's hand for a while. It would be very annoying if, after going to the trouble of centering the hand cursor, the automatic tracking simply took over and continued moving the hand around as it pleased.

Even though we now have the center position of the magnet button, this is still not sufficient for relocating the hand cursor. We additionally have to offset for the height and width of the cursor itself in order to ensure that it is the center of the cursor, rather than its top-left corner, that gets centered. After performing this operation, we assign the final value to the lockPoint variable. We also perform a similar operation to find the offset of the cursor's current upper-left corner and assign it to the cursorPoint variable. Given these two points, we can now animate the cursor from its position on the edge of the button to its intended position. This animation is shelled out to the AnimateCursorToLockPosition method shown in Listing 6-25. The override of OnKinectCursorLeave is more or less the same as the enter code, except in reverse.
Listing 6-25. Animating the Cursor On Lock and Unlock
private void AnimateCursorToLockPosition(KinectCursorEventArgs e, double x, double y,
CursorAdorner cursor, ref Point lockPoint,
ref Point cursorPoint)
{
DoubleAnimation moveLeft = new DoubleAnimation(cursorPoint.X, lockPoint.X
, new Duration(TimeSpan.FromMilliseconds(LockInterval)));
Storyboard.SetTarget(moveLeft, cursor.CursorVisual);
Storyboard.SetTargetProperty(moveLeft, new PropertyPath(Canvas.LeftProperty));
DoubleAnimation moveTop = new DoubleAnimation(cursorPoint.Y, lockPoint.Y
, new Duration(TimeSpan.FromMilliseconds(LockInterval)));
Storyboard.SetTarget(moveTop, cursor.CursorVisual);
Storyboard.SetTargetProperty(moveTop, new PropertyPath(Canvas.TopProperty));
    if (_move != null)
        _move.Stop(e.Cursor);

    _move = new Storyboard();
    _move.Children.Add(moveTop);
    _move.Children.Add(moveLeft);
    _move.Completed += delegate
    {
        this.RaiseEvent(new KinectCursorEventArgs(KinectCursorLockEvent
            , new Point(x, y), e.Z) { Cursor = e.Cursor });
    };
    _move.Begin(cursor, false);
}
private void AnimateCursorAwayFromLockPosition(KinectCursorEventArgs e
, CursorAdorner cursor, ref Point lockPoint, ref Point cursorPoint)
{
DoubleAnimation moveLeft = new DoubleAnimation(lockPoint.X, cursorPoint.X
, new Duration(TimeSpan.FromMilliseconds(UnlockInterval)));
Storyboard.SetTarget(moveLeft, cursor.CursorVisual);
Storyboard.SetTargetProperty(moveLeft, new PropertyPath(Canvas.LeftProperty));
DoubleAnimation moveTop = new DoubleAnimation(lockPoint.Y, cursorPoint.Y
, new Duration(TimeSpan.FromMilliseconds(UnlockInterval)));
Storyboard.SetTarget(moveTop, cursor.CursorVisual);
Storyboard.SetTargetProperty(moveTop, new PropertyPath(Canvas.TopProperty));
_move = new Storyboard();
_move.Children.Add(moveTop);
_move.Children.Add(moveLeft);
_move.Completed += delegate
{
_move.Stop(cursor);
cursor.UpdateCursor(new Point(e.X, e.Y), false);
this.RaiseEvent(new KinectCursorEventArgs(KinectCursorUnlockEvent
, new Point(e.X, e.Y), e.Z) { Cursor = e.Cursor });
};
_move.Begin(cursor, true);
}
In our lock and unlock animations, we wait until the animations are completed before throwing the KinectCursorLock and KinectCursorUnlock events. In the magnet button itself, these events are only minimally useful. We can use them later on, however, to trigger affordances for the magnetic slide button we will build on top of the code we have just written.
Swipe
The swipe is a pure gesture, like the wave. Detecting a swipe requires constantly tracking the movements of the user's hand and maintaining a backlog of previous hand positions. Because the gesture also has a velocity threshold, we need to track moments in time as well as coordinates in three-dimensional space. Listing 6-26 illustrates a struct for storing X, Y, and Z coordinates as well as a temporal coordinate. If you are familiar with vectors in graphics programming, you can think of this as a four-dimensional vector. Add this struct to your control library project.
Listing 6-26. The GesturePoint Four-Dimensional Object
public struct GesturePoint
{
public double X {get;set;}
public double Y {get;set;}
public double Z {get;set;}
public DateTime T {get;set;}
public override bool Equals(object obj)
{
var o = (GesturePoint)obj;
return (X == o.X) && (Y == o.Y) && (Z == o.Z) && (T == o.T);
}
public override int GetHashCode()
{
return base.GetHashCode();
}
}
We will implement swipe gesture detection in the KinectCursorManager we constructed earlier so we can reuse it later in the magnetic slide button. Listing 6-27 delineates several new fields that should be added to the KinectCursorManager in order to support swipe detection. The _gesturePoints field stores our ongoing collection of points. It shouldn't grow too big, though, as we will be constantly removing points as well as adding new ones. The _swipeTime and _swipeDeviation fields provide thresholds for how long a swipe may take and how far off along the Y-axis it can stray before we invalidate it. If a swipe is invalidated because it either takes too long or goes astray, we remove all previous points from the _gesturePoints list and start looking for new swipes. _swipeLength provides the distance threshold for a successful swipe. We also provide two new events that can be handled, indicating that a swipe has either been accomplished or invalidated. Since this is a pure gesture that has nothing to do with GUIs, we will not be using a Click event anywhere in this implementation.
Listing 6-27. Swipe Gesture Detection Fields
private List<GesturePoint> _gesturePoints;
private bool _gesturePointTrackingEnabled;
private double _swipeLength, _swipeDeviation;
private int _swipeTime;
public event KinectCursorEventHandler SwipeDetected;
public event KinectCursorEventHandler SwipeOutOfBoundsDetected;
private double _xOutOfBoundsLength;
private static double _initialSwipeX;
_xOutOfBoundsLength and _initialSwipeX are used in case we want to root the start of a swipe to a particular location. In general, we do not care where a gesture starts but just look for any sequence of points in the _gesturePoints list for a pattern match. Occasionally, however, it may be useful to only look for swipes that begin at a given point, for instance at the edge of the screen if we are implementing horizontal scrolling. In this case, we will also want an offset threshold beyond which we ignore any movement, because these movements could not possibly generate swipes that we are interested in.
Listing 6-28 provides some helper methods and public properties we will need in order to manage gesture tracking. The GesturePointTrackingInitialize method allows the developer to set up the parameters for the sort of gesture tracking that will be performed. After initializing swipe detection, the developer also needs to turn it on with the GesturePointTrackingStart method. Naturally, the developer will also require a way to end swipe detection with a GesturePointTrackingStop method. Finally, we provide two overloaded helper methods for managing our sequence of gesture points called ResetGesturePoint. These are used to throw away points that we no longer care about.
Listing 6-28. Swipe Gesture Helper Methods and Public Properties
public void GesturePointTrackingInitialize(double swipeLength
, double swipeDeviation, int swipeTime, double xOutOfBounds)
{
_swipeLength = swipeLength;
_swipeDeviation = swipeDeviation;
_swipeTime = swipeTime;
_xOutOfBoundsLength = xOutOfBounds;
}
public void GesturePointTrackingStart()
{
if (_swipeLength + _swipeDeviation + _swipeTime == 0)
throw (new InvalidOperationException("Swipe detection not initialized."));
_gesturePointTrackingEnabled = true;
}
public void GesturePointTrackingStop()
{
_xOutOfBoundsLength = 0;
_gesturePointTrackingEnabled = false;
_gesturePoints.Clear();
}
public bool GesturePointTrackingEnabled
{
get { return _gesturePointTrackingEnabled; }
}
private void ResetGesturePoint(GesturePoint point)
{
bool startRemoving = false;
for (int i = GesturePoints.Count - 1; i >= 0; i--)
{
if (startRemoving)
GesturePoints.RemoveAt(i);
else
if (GesturePoints[i].Equals(point))
startRemoving = true;
}
}
private void ResetGesturePoint(int point)
{
if (point < 1)
return;
for (int i = point - 1; i >= 0; i--)
{
GesturePoints.RemoveAt(i);
}
}
The core of our swipe detection algorithm is contained in the HandleGestureTracking method from Listing 6-29. This should be hooked up to the Kinect's skeleton tracking events by placing it in the UpdateCursor method of the KinectCursorManager. Every time it receives a new coordinate point, the HandleGestureTracking method adds the latest GesturePoint to the GesturePoints sequence, and then performs several checks. First, it determines if the new point deviates too far along the Y-axis from the start of the gesture. If so, it throws an out-of-bounds event and gets rid of all the accumulated points. This effectively starts swipe detection over again. Next, it checks the interval between the start of the gesture and the current time. If this is greater than the swipe threshold, it gets rid of the initial gesture point, making the next gesture point in the series the initial gesture point. If our new hand position has survived up to this point in the algorithm, it has done pretty well. We now check to see if the distance between the initial X position and the current position is greater than the threshold we established for a successful swipe. If it is, then we are finally able to throw the SwipeDetected event. If it is not, we optionally check to see if the current X position exceeds the outer boundary for swipe detection and throw the correct event. Then we wait for a new hand position to be sent to the HandleGestureTracking method. It shouldn't take long.
Listing 6-29. Heart of Swipe Detection
private void HandleGestureTracking(float x, float y, float z)
{
if (!_gesturePointTrackingEnabled)
return;
// check to see if xOutOfBounds is being used
if (_xOutOfBoundsLength != 0 && _initialSwipeX == 0)
{
_initialSwipeX = x;
}
GesturePoint newPoint = new GesturePoint() { X = x, Y = y, Z = z, T = DateTime.Now };
GesturePoints.Add(newPoint);
GesturePoint startPoint = GesturePoints[0];
var point = new Point(x, y);
//check for deviation
if (Math.Abs(newPoint.Y - startPoint.Y) > _swipeDeviation)
{
if (SwipeOutOfBoundsDetected != null)
SwipeOutOfBoundsDetected(this, new KinectCursorEventArgs(point)
{ Z = z, Cursor = _cursorAdorner });
ResetGesturePoint(GesturePoints.Count);
return;
}
//check time
if ((newPoint.T - startPoint.T).TotalMilliseconds > _swipeTime)
{
GesturePoints.RemoveAt(0);
startPoint = GesturePoints[0];
}
// check to see if distance has been achieved swipe left
if ((_swipeLength < 0 && newPoint.X - startPoint.X < _swipeLength)
// check to see if distance has been achieved swipe right
|| (_swipeLength > 0 && newPoint.X - startPoint.X > _swipeLength))
{
GesturePoints.Clear();
//throw local event
if (SwipeDetected != null)
SwipeDetected(this, new KinectCursorEventArgs(point)
{ Z = z, Cursor = _cursorAdorner });
return;
}
if (_xOutOfBoundsLength != 0 &&
    ((_xOutOfBoundsLength < 0 && newPoint.X - _initialSwipeX < _xOutOfBoundsLength)
    || (_xOutOfBoundsLength > 0 && newPoint.X - _initialSwipeX > _xOutOfBoundsLength))
    )
{
    if (SwipeOutOfBoundsDetected != null)
        SwipeOutOfBoundsDetected(this, new KinectCursorEventArgs(point)
            { Z = z, Cursor = _cursorAdorner });
}
}
Magnetic Slide
The magnetic slide is the holy grail of Kinect gestures. It was discovered by UX designers at Harmonix as they were creating Dance Central. It was originally used in a menu system but has been adopted as a button idiom in several places, including the Xbox dashboard itself. It is substantially superior to the magnet button because it does not require the user to wait for something to happen. In Xbox games as in life, no one wants to wait around for things to happen. The alternative push button came with its own baggage. Chief of these is that it is awkward to use. The magnetic slide is like the magnet button to the extent that once a user enters the area of the button, the visual cursor automatically locks into place. At this point, however, things diverge. Instead of hovering over the button in order to have something happen, the user swipes her hand to activate the button.
Programmatically, the magnetic slide is basically a mashup of the magnet button and the swipe gesture. To build the magnetic slide button, then, we simply have to deactivate the timer in the hover button above us in the inheritance tree and hook into a swipe detection engine instead. Listing 6-30 illustrates the basic structure of the magnetic slide button. The constructor deactivates the timer in the base class for us. InitializeSwipe and DeInitializeSwipe take care of hooking up the swipe detection functionality in KinectCursorManager.
Listing 6-30. Basic Magnetic Slide Implementation
public class MagneticSlide: MagnetButton
{
private bool _isLookingForSwipes;
public MagneticSlide()
{
base._timerEnabled = false;
}
private void InitializeSwipe()
{
if (_isLookingForSwipes)
return;
_isLookingForSwipes = true;
var kinectMgr = KinectCursorManager.Instance;
kinectMgr.GesturePointTrackingInitialize(SwipeLength, MaxDeviation
, MaxSwipeTime, XOutOfBoundsLength);
kinectMgr.SwipeDetected +=
new Input.KinectCursorEventHandler(kinectMgr_SwipeDetected);
kinectMgr.SwipeOutOfBoundsDetected +=
new Input.KinectCursorEventHandler(kinectMgr_SwipeOutOfBoundsDetected);
kinectMgr.GesturePointTrackingStart();
}
private void DeInitializeSwipe()
{
    var kinectMgr = KinectCursorManager.Instance;
    kinectMgr.SwipeDetected -= kinectMgr_SwipeDetected;
    kinectMgr.SwipeOutOfBoundsDetected -= kinectMgr_SwipeOutOfBoundsDetected;
    kinectMgr.GesturePointTrackingStop();
    _isLookingForSwipes = false;
}
// . . .
}
Additionally, we will want to expose the parameters for initializing swipe detection on the control itself so developers can adjust the button for their particular needs. Of note in Listing 6-31 is the way we measure the SwipeLength and XOutOfBoundsLength properties. Both have negative numbers for their default values. This is because a magnetic slide is typically located on the right side of the screen, requiring users to swipe left. Because of this, the detection offset as well as the out-of-bounds offset from the button location is a negative X coordinate value.
Listing 6-31. Magnetic Slide Properties
public static readonly DependencyProperty SwipeLengthProperty =
DependencyProperty.Register("SwipeLength", typeof(double), typeof(MagneticSlide)
, new UIPropertyMetadata(-500d));
public double SwipeLength
{
get { return (double)GetValue(SwipeLengthProperty); }
set { SetValue(SwipeLengthProperty, value); }
}
public static readonly DependencyProperty MaxDeviationProperty =
DependencyProperty.Register("MaxDeviation", typeof(double), typeof(MagneticSlide),
new UIPropertyMetadata(100d));
public double MaxDeviation
{
get { return (double)GetValue(MaxDeviationProperty); }
set { SetValue(MaxDeviationProperty, value); }
}
public static readonly DependencyProperty XOutOfBoundsLengthProperty =
DependencyProperty.Register("XOutOfBoundsLength", typeof(double),
typeof(MagneticSlide)
, new UIPropertyMetadata(-700d));
public double XOutOfBoundsLength
{
get { return (double)GetValue(XOutOfBoundsLengthProperty); }
set { SetValue(XOutOfBoundsLengthProperty, value); }
}
public static readonly DependencyProperty MaxSwipeTimeProperty =
DependencyProperty.Register("MaxSwipeTime", typeof(int), typeof(MagneticSlide),
new UIPropertyMetadata(300));
public int MaxSwipeTime
{
get { return (int)GetValue(MaxSwipeTimeProperty); }
set { SetValue(MaxSwipeTimeProperty, value); }
}
To complete our implementation of the magnetic slide, we just need to handle the base enter event as well as the events captured on the swipe detection engine. We will not handle the base leave event because when a user performs a swipe, he will likely trigger the leave event inadvertently. We do not want to deactivate any of the algorithms we have initiated at this point, however. Instead, we wait for either a successful swipe detection or a swipe out-of-bounds event before turning swipe detection off. When a swipe is detected, of course, we throw the standard Click event.
Listing 6-32. Magnetic Slide Event Management
public static readonly RoutedEvent SwipeOutOfBoundsEvent
= EventManager.RegisterRoutedEvent("SwipeOutOfBounds", RoutingStrategy.Bubble,
typeof(KinectCursorEventHandler), typeof(KinectInput));
public event RoutedEventHandler SwipeOutOfBounds
{
add { AddHandler(SwipeOutOfBoundsEvent, value); }
remove { RemoveHandler(SwipeOutOfBoundsEvent, value); }
}
void kinectMgr_SwipeOutOfBoundsDetected(object sender, Input.KinectCursorEventArgs e)
{
DeInitializeSwipe();
RaiseEvent(new KinectCursorEventArgs(SwipeOutOfBoundsEvent));
}
void kinectMgr_SwipeDetected(object sender, Input.KinectCursorEventArgs e)
{
DeInitializeSwipe();
RaiseEvent(new RoutedEventArgs(ClickEvent));
}
protected override void OnKinectCursorEnter(object sender, Input.KinectCursorEventArgs e)
{
InitializeSwipe();
base.OnKinectCursorEnter(sender, e);
}
Even as this book is being prepared for publication, Microsoft has released a new version of the Xbox dashboard that includes a new variation on the magnetic slide idiom. Part of the excitement around each of these iterations in Kinect UX is that designers are coming up with things never seen before. For years, they have been working around the same sorts of controls, day in and day out. The boundaries of legitimate UX had already been established in web and desktop applications. Kinect provides a reset of these rules and offers new possibilities as well as new challenges for the world of software design.
Vertical Scroll
Not all content displays perfectly within the confines of every screen. Often there is more content than screen real estate, which requires the user to scroll the screen or listing control to reveal additional content. Traditionally, it has been taboo to design an interface with horizontal scrolling; however, the swipe touch gesture seems to circumvent this concern in touch interfaces. Both Xbox and Sony PlayStation systems have used vertical scrolling carousels for menus. Harmonix's Dance Central series also uses a vertical scrolling menu system. Dance Central was the first to show how successful the vertical scrolling interface is when applied to gestural interfaces. In the gestural interface paradigm, vertical scrolling occurs when the user raises or lowers her arm to cause screen content to scroll vertically; the arm is extended away from the body, as shown in Figure 6-3. Raising the arm causes the screen, menu, or carousel to scroll from bottom to top, and lowering the arm, from top to bottom.
Figure 6-3. The vertical scroll
While the horizontal swipe is seemingly more common in Kinect applications (it is the dominant gesture in the new Metro-style Xbox interface), the vertical scroll is the more user-friendly and better choice for user interfaces. The swipe, be it horizontal or vertical, suffers from a few user experience problems. In addition, it is technically difficult to detect, because the swipe form and motion varies dramatically from person to person. The same person often does not have a constant swipe motion. The swipe motion works on touch interfaces because the action does not occur unless the user makes contact with the screen. However, with a gestural interface, the user makes "contact" with the visual elements when his hand is within the same coordinate space as the visual element.
When a user swipes, he typically keeps his hand on the same relative horizontal plane throughout the course of the motion. This creates problems when he intends to make multiple successive swipes. It creates an awkward effect where the user can accidentally undo the previous swipe. For example, the user swipes from right to left with his right hand. This advances the interface by a page. The user's right hand is now positioned on the left side of his body. The user then moves his hand back to the original starting point for the purpose of performing another right-to-left swipe. However, if he keeps his hand on the same horizontal plane, the application detects a left-to-right swipe and moves the interface back to the previous page. The user has to create a looping motion in order to avoid the unintended gesture. Further, frequent swiping causes fatigue in the user. These problems are only exacerbated when performed vertically.
The vertical scroll does not have these same user experience faults; it is easier to use and more intuitive for the user. Additionally, the user does not suffer fatigue from the gesture and is afforded more granular control over the scrolling action. From a technical standpoint, the vertical scroll is easier to implement than the swipe. The vertical scroll is technically a pose and not a gesture. The scroll action is determined by the static position of the user's arm and not by a motion. The direction and amount of scroll is based on the angle of the user's arm. Figure 6-4 illustrates the vertical scroll.
Figure 6-4. Vertical scroll range of motion
Using the pose detection code from Chapter 5, we can calculate the angle created from the torso to the user's shoulder and wrist. Define an angle range for a neutral zone where no change occurs while the user is within this range. When a user extends her arm away from her body, similar to the motion shown in Figure 6-4, the arm naturally rests at a -5 or 355-degree angle. This should be the offset zero of the vertical scroll range. A recommended neutral zone is plus or minus 20 degrees from offset zero. Beyond that, the number of incremental zones and the magnitude of increment depends on the application's requirements. However, it is advisable to have at least two zones above and below the neutral zone for large and small increments. This gives the user the same scrolling granularity as a traditional GUI vertical scrollbar.
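The zone logic just described can be sketched in a language-agnostic way. The following Python fragment maps an arm angle to a scroll increment using the chapter's offset zero of -5 degrees and its recommended 20-degree neutral zone; the width of the small-increment zone and the step values are illustrative assumptions, since the chapter leaves those to the application's requirements:

```python
def scroll_increment(angle_degrees,
                     offset_zero=-5.0,   # arm at rest, per the chapter
                     neutral=20.0,       # recommended +/- 20 degree dead zone
                     small_zone=20.0):   # width of the small-increment zone (assumed)
    """Map an arm angle to a scroll step: 0 in the neutral zone,
    +/-1 in the small-increment zones, +/-2 beyond them."""
    delta = angle_degrees - offset_zero
    if abs(delta) <= neutral:
        return 0                         # no scrolling inside the neutral zone
    direction = 1 if delta > 0 else -1   # raised arm scrolls one way, lowered the other
    if abs(delta) <= neutral + small_zone:
        return direction                 # small increment
    return 2 * direction                 # large increment
```

The same shape extends naturally to more zones; the essential point is that the pose (a static angle) drives the scroll continuously, with no motion tracking required.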
Universal Pause
The universal pause, also known as the guide gesture or escape gesture, is one of the few gestures that Microsoft actually recommends and provides guidance around. This gesture is accomplished by posing the left arm at a 45-degree angle out from the body. It is used in a variety of Kinect game titles to either pause action or bring up an Xbox menu. It is peculiar among the gestures described in this book in that it does not appear to have any commonly known symbolic antecedents and can be considered an artificial, or even a digitally authentic, gesture. On the plus side, a universal pause is easy to perform, does not strain the arm, and is unlikely to be confused with any other gesture.
The technical requirements for implementing a universal pause are similar to those used for the vertical scroll. Since the basic algorithm for detecting a universal pause has already been covered elsewhere, no code will be provided for it in this chapter. I leave the implementation of this gesture up to the ingenuity of the reader.
The Future of Gestures
Trips to the grocery store are often unremarkable. Kinect will soon become just as unremarkable. It will pass into unremarkability and be forgotten by most people, ending up a treasured relic collected by geeks and nerds—these authors included. It will simply disappear, and the current stage of hardware and software technology advances we are experiencing will recede into invisibility. Is this crazy talk? After all, who would waste their time writing a book on a technology they predict will vanish from the consciousness of the world?
The error is not in underestimating Kinect's significance but rather in our assessment of a trip to the grocery store. When entering a modern grocery store, the doors to the store automatically open as you approach them. That is remarkable, only surpassed by the fact that no one notices, cares, or is even aware of the feature. Someday Kinect will also blend into the fabric of everyday existence and disappear from our consciousness, just as automatic doors do.
Kinect and the NUI world have just begun. We are several years from Kinect disappearing, but over time, the experience will change dramatically. The scene in Minority Report with Tom Cruise gesturing (flailing his arms around) to open and shuffle documents on a screen has become the example of the potential of Kinect-driven applications, which is quite unfortunate. Science fiction has a much better imagination than this and provides far better technology than what we can bring into reality. Instead, think Star Trek, Star Wars, or 2001: A Space Odyssey. In each of these works of science fiction, computers can see and sense the user. In each of these works, users seamlessly interact with computers using their voices and their gestures. As some works of fiction show, this certainly can be a negative and requires some boundaries. Today we have grown accustomed to security cameras recording everything we do. Imagine how this changes once computers begin processing these recordings in real time.
While there are legitimate concerns about the prospect of science fiction becoming reality, it is coming all the same. Focusing on the beneficial aspects is important. Kinect technology facilitates smart environments. It allows applications to be built that process a user's gestures to derive intent without being explicitly given such information. Users today perform gestures for Kinect. Games and applications look for specific interactions performed by the user. The user must actively communicate and issue commands to the application. However, there is much more going on between the user and the experience that is not captured and processed. If the application can detect other gestures, or more specifically, the mood of the user, it can tailor the experience to the user. We have to progress farther from where we are today to reach this future. Today, the detectable gestures are simple and we are just learning how to build user interfaces. We will likely find that with most gesture-based applications, the user interface disappears, much like touch input eliminated the cursor.
Imagine getting home from work, walking into your den, and saying, "Computer, play music". The computer understands the voice command and begins playing music. However, the computer is also able to detect that you had a hard day and need something to lighten the mood. The computer selects a playlist of music accordingly. Speech will overwhelmingly become the primary form of issuing commands to computers, and gestures will augment it. In this example, the computer would detect your mood based on your body language. In this way, gestures become a passive or contextual form of communication to the computer. This in no way diminishes or makes gestures less important. It actually increases the importance of the gesture, but not in a direct way.
Today, there are proximity sensors that turn lights on and off when a person enters a room. This is a dumb system in that there is no context provided. It is possible, using Kinect technology, to detect more information from the user's movements and adjust the lights. For example, if it is 2:00 AM and you are getting up for a glass of water, the computer may raise the light level slightly, making it possible to see but not blinding you with a flash of bright lights. However, on another night when you return home at 2:00 AM from a night out, it can detect that you are awake and turn the lights on completely.
Kinect is still very new and we are still trying to understand how to get to this future vision. The first year of Kinect was fascinating to observe and experience. When initially released for the Xbox, the game titles available were limited. The more popular titles were sports-based games, which did little more than reproduce Wii titles. Each of these games featured several basic gestures like running, jumping, kicking, and swinging or throwing objects. All of the early Kinect games for the Xbox also featured simple menu systems that used cursors to follow the hands. The games employed the same button interfaces discussed previously in this chapter. The precedent for user interfaces and games was established with this first set of games.
While there have been dramatic UX advances, the gestures games and applications detect today are still simple and crude compared to what will be built in the next couple of years. We are still learning how to define and detect gestures. Consequently, the gestures have to be somewhat brutish. The more pronounced the waving of the hand or swiping (flailing) of the arms, the easier it is to detect. When we can detect the subtle aspects of a gesture, applications will become truly immersive.
Soccer games today only detect the basic kicking motion. The game cannot determine if the user kicked with her toe, laces, instep, or heel. Each of these kicks affects the ball differently, and this should in turn be reflected in the game. Further, the game should ideally be able to apply physics appropriately, giving the ball a realistic acceleration, velocity, spin, and direction based on the user's kicking motion and foot position when contact was made with the virtual ball.
The limitations we have today are in part due to the resolution of Kinect's cameras. Future versions of the Kinect hardware will include better cameras that will provide better depth data to developers. Microsoft has already released some information about a second version of the Kinect hardware. This invariably will result in more accurate detection of a user's movements, which has two significant effects on Kinect development. The first is that the accuracy of skeleton joint resolution increases, which not only increases the accuracy of gesture detection, but also the types of detectable gestures. The other result is that it will be possible to report on additional joints such as fingers and non-joints like the lips, nose, ears, and eyes. It is currently possible to detect these points, but third-party image processing tools are required. It is not native to the Kinect SDK.
Being able to track fingers and detect finger gestures obviously allows for sign language. Suddenly, after adding finger tracking, the gesture command library explodes with possibilities. Users can interact with and manipulate virtual objects with greater levels of precision and dexterity in a natural way. Finger gestures communicate information that is more complex and provide greater context to what the user is communicating. The next time you have a conversation with someone, do so with your fingers balled up in a fist; immediately, the tone of the conversation will change. The partner in your conversation is either going to think you are hostile towards them or just plain ridiculous. No doubt, neither of these two emotions will be in line with your intended tone for the conversation. It is ineffective to point at something with only your fist. Shaking your fist at someone and shaking your finger at them are different gestures that communicate different messages. The current state of skeleton tracking on Kinect does not allow developers to detect these distinctions.
Even with the addition of finger gesture detection, the Kinect experience does not change in a revolutionary way from today's experience, where Kinect is highly prominent and intrusive. The user must be aware of Kinect and know how to interact with the hardware. Watch someone play a Kinect game and notice how she addresses the device. Users are stiff and often unnatural. Gestures are frequently not recognized and require repeating or, worse, they are incorrectly detected, resulting in unintended reactions. Furthermore, the gestures a user makes often have to be exaggerated in order to be detected. But this is only temporary.
In the future, the hardware and software will improve and users will become more comfortable and natural with gestural interfaces. At that point, Kinect will have become so remarkable that it is as unremarkable as an automatic door.
Summary
This chapter serves as a snapshot in time—an overview of the state of the art in gestural interfaces at the beginning of 2012. In five years or so, as the gestural interface moves forward and NUI theory mutates, the ideas put forth in this chapter will no doubt appear quaint. If the authors are fortunate, however, they will not yet seem quaint within the next year.
Here we introduced you to the theory of gestures and the intellectual collisions created by the advent of a true gestural interface. The chapter addressed the concepts and language of the natural user interface and how they apply to programming for Kinect. With this foundation in place, you were guided through the complexities and pitfalls of actually implementing these gestural idioms: the wave, the hover button, the magnet button, the push button, the magnetic slide, the universal pause, vertical scrolling, and swiping. The code provided in this section will, we hope, offer inspiration and practical skills for building new idioms and expanding the gestural vocabulary.
CHAPTER 7
Speech
The microphone array is the hidden gem of the Kinect sensor. The array is made up of four separate microphones spread out linearly at the bottom of the Kinect. By comparing when each microphone captures the same audio signal, the microphone array can be used to determine the direction from which the signal is coming. This technique can also be used to make the microphone array pay more attention to sound from one particular direction rather than another. Finally, algorithms can be applied to the audio streams captured from the microphone array in order to perform complex sound dampening effects to remove irrelevant background noise. All of this sophisticated interaction between Kinect hardware and Kinect SDK software allows speech commands to be used in a large room where the speaker's lips are more than a few inches from the microphone.
When Kinect was first released for the Xbox 360, the microphone array tended to get overlooked. This was due in part to the excitement over skeleton tracking, which seemed like a much more innovative technology, but also in part to slow efforts to take advantage of the Kinect's audio capabilities in games or in the Xbox dashboard.
The first notion I had of how impressive the microphone array really is occurred by accident when my son, an avid player of first-person shooters on Xbox Live, broke his headset. I came home from work one day to find him using the Kinect sensor as a microphone to talk in-game with his team. He had somehow discovered that he could sit comfortably ten feet away from the television and the Kinect with a wireless game controller and chat away with his friends online. The Kinect was able not only to pick up his voice but also to eliminate background noises: the sound of his voice and the voices of his friends coming over our sound system, as well as in-game music and in-game explosions. This was particularly striking at the time, as I had just come home from a cross-country conference call using a rather expensive conference-call telephone, and we constantly had to ask the various speakers to repeat themselves because we couldn't hear what they were saying.
As independent developers have started working with Kinect technology, it has also become apparent that the Kinect microphone array fills a particular gap in Kinect applications. While the visual analysis made possible by the Kinect is impressive, it is still not able to handle fine motor control. As we have moved from one user interface paradigm to another – from command-line applications, to tabbed applications, to the mouse-enabled graphical user interface, and to the touch-enabled natural user interface – each interface has always provided an easy way to perform the basic selection action. It can even be said that each subsequent user interface technology has improved our ability to select things. The Kinect, oddly enough, breaks this trend.
Selection has turned out to be one of the most complicated actions to master with the Kinect. The initial selection idiom introduced on the Xbox 360 involved holding one's hand steady over a given location for a few seconds. A subsequent idiom, introduced in the game Dance Central, improved on this by requiring a shorter hold and then a swipe – an idiom eventually adopted for the Xbox dashboard. Other attempts by independent developers to solve this problem have included gestures such as holding an arm over one's head.
The problem of performing a select action with the Kinect can be solved relatively easily by combining speech recognition commands with skeleton tracking to create a hybrid gesture: hold and speak. Menus can be implemented even more easily by simply providing a list of menu commands and allowing the user to speak the command she wants to select – much as the Xbox currently does in the dashboard and in its Netflix Kinect-enabled application. We can expect to see many unique hybrid solutions in the future as independent developers as well as video game companies continue to experiment with new idioms for interaction rather than simply trying to reimplement point-and-click.
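The hold-and-speak idiom reduces to a simple conjunction: a target is activated only when the tracked hand hovers over it at the moment a matching speech command is recognized. The following Python sketch shows that combining logic; the function name, the bounds representation, and the command word are all illustrative assumptions rather than anything from the Kinect SDK:

```python
def hybrid_select(hand_position, target_bounds, spoken_word,
                  command_word="select"):
    """Hold-and-speak selection: activate the hovered target only when
    a matching speech command arrives while the hand is over it."""
    x, y = hand_position
    left, top, right, bottom = target_bounds
    over_target = left <= x <= right and top <= y <= bottom
    return over_target and spoken_word.lower() == command_word

# Hand inside the button's bounds plus the word "select" activates it.
activated = hybrid_select((0.5, 0.5), (0.0, 0.0, 1.0, 1.0), "Select")
```

In a real application the hand test would come from skeleton tracking and the spoken word from a speech recognition event, but the selection decision itself stays this simple, which is precisely the idiom's appeal.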
Microphone Array Basics
When you install the Microsoft Kinect SDK, the components required for speech recognition are automatically chain-installed. The Kinect microphone array works on top of preexisting code libraries that have been around since Windows Vista. These preexisting components include the Voice Capture DirectX Media Object (DMO) and the Speech Recognition API (SAPI).
In C#, the Kinect SDK provides a wrapper that extends the Voice Capture DMO. The Voice Capture DMO is intended to provide an API for working with microphone arrays to provide functionality such as acoustic echo cancellation (AEC), automatic gain control (AGC), and noise suppression. This functionality can be found in the audio classes of the SDK. The Kinect SDK audio wrapper simplifies working with the Voice Capture DMO as well as optimizing DMO performance with the Kinect sensor.
To implement speech recognition with the Kinect SDK, the following automatically installed libraries are required: the Speech Platform API, the Speech Platform SDK, and the Kinect for Windows Runtime Language Pack.
The Speech Recognition API is simply the development library that allows you to develop against the built-in speech recognition capabilities of the operating system. It can be used with or without the Kinect SDK, for instance if you want to add speech commands to a standard desktop application that uses a microphone other than the Kinect microphone array.
The Kinect for Windows Runtime Language Pack, on the other hand, is a special set of linguistic models used for interoperability between the Kinect SDK and SAPI components. Just as Kinect skeleton recognition required massive computational modeling to provide decision trees to interpret joint positions, the SAPI library requires complex modeling to aid in the interpretation of language patterns as they are received by the Kinect microphone array. The Kinect Language Pack provides these models to optimize the recognition of speech commands.
MSRKinectAudio
The main class for working with audio is KinectAudioSource. The purpose of the KinectAudioSource class is to stream either raw or modified audio from the microphone array. The audio stream can be modified to include a variety of algorithms to improve its quality, including noise suppression, automatic gain control, and acoustic echo cancellation. KinectAudioSource can be used to configure the microphone array to work in different modes. It can also be used to detect the direction from which audio is primarily coming as well as to force the microphone array to point in a given direction.
Throughout this chapter, I will attempt to shield you as much as I can from a low-level understanding of the technical aspects of audio processing. Nevertheless, in order to work with the KinectAudioSource, it is helpful at least to become familiar with some of the vocabulary used in audio recording and audio transmission. Please use the following glossary as a handy reference to the concepts abstracted by the KinectAudioSource.
• Acoustic Echo Cancellation (AEC) refers to a technique for dealing with acoustic echoes. Acoustic echoes occur when sound from a speaker is sent back over a microphone. A common way to understand this is to think of what happens when one is on the telephone and hears one's own speech, with a certain amount of delay, repeated over the receiver. Acoustic echo cancellation deals with this by subtracting sound patterns coming over a speaker from the sound picked up by the microphone.
• Acoustic Echo Suppression (AES) refers to algorithms used to further eliminate any residual echo left over after AEC has occurred.
• Automatic Gain Control (AGC) pertains to algorithms used to make the amplitude of the speaker's voice consistent over time. As a speaker approaches or moves away from the microphone, her voice may appear to become louder or softer. AGC attempts to even out these changes.
• Beam Forming refers to algorithmic techniques that emulate a directional microphone. Rather than having a single microphone on a motor that can be turned, beam forming is used in conjunction with a microphone array (such as the one provided with the Kinect sensor) to achieve the same results using multiple stationary microphones.
• Center Clipping is a process that removes small echo residuals that remain after AEC processing in one-way communication scenarios.
• Frame Size: the AEC algorithm processes PCM audio samples one frame at a time. The frame size is the size of the audio frame measured in samples.
• Gain Bounding ensures that the microphone has the correct level of gain. If gain is too high, the captured signal might be saturated and will be clipped. Clipping is a non-linear effect, which will cause the acoustic echo cancellation (AEC) algorithm to fail. If the gain is too low, the signal-to-noise ratio is low, which can also cause the AEC algorithm to fail or not perform well.
• Noise Filling adds a small amount of noise to portions of the signal where center clipping has removed the residual echoes. This results in a better experience for the user than leaving silent gaps in the signal.
• Noise Suppression (NS) is used to remove non-speech sound patterns from the audio signal received by a microphone. By removing this background noise, the actual speech picked up by the microphone can be made cleaner and clearer.
• Optibeam: the Kinect sensor supports eleven beams from its four microphones. These eleven beams should be thought of as logical structures, whereas the four channels are physical structures. Optibeam is a system mode that performs beam forming.
• Signal-to-Noise Ratio (SNR) is a measure of the power of a speech signal relative to the overall power of background noise. The higher, the better.
• Single Channel: the Kinect sensor has four microphones and consequently supports four channels. Single channel is a system mode setting that turns off beam forming.
The KinectAudioSource class offers a high level of control over many aspects of audio recording, though it currently does not expose all aspects of the underlying DMO. The various properties used to tweak audio processing with the KinectAudioSource are known as features. Table 7-1 lists the feature properties that can be adjusted. Early beta versions of the Kinect for Windows SDK tried to closely match the API of the underlying DMO, which provided a greater level of control but also exposed a remarkable level of complexity. The release version of the SDK distills all the possible configurations of the DMO into its essential features and quietly takes care of the underlying configuration details. For anyone who has ever had to work with those underlying configuration properties, this will come as a great relief.
Table 7-1. Kinect Audio Feature Properties

Name                          Values / Default                        What does it do?
AutomaticGainControlEnabled   True, False. Default: False             Specifies whether the DMO performs
                                                                      automatic gain control.
BeamAngleMode                 Adaptive, Automatic, Manual.            Specifies the algorithms used to
                              Default: Automatic                      perform microphone array processing.
EchoCancellationMode          CancellationAndSuppression,             Turns AEC on and off.
                              CancellationOnly, None. Default: None
NoiseSuppression              True, False. Default: True              Specifies whether the DMO performs
                                                                      noise suppression.
The EchoCancellationMode is one of those miraculous technical feats hidden behind an unassuming name. The possible settings are listed in Table 7-2. In order to use AEC, you will need to discover and provide an integer value to the EchoCancellationSpeakerIndex property indicating the speaker whose output needs to be cancelled. The SDK automatically performs discovery for the active microphone.
Table 7-2. EchoCancellationMode Enumeration

Echo Cancellation Mode       What does it do?
CancellationAndSuppression   Acoustic echo cancellation as well as additional acoustic echo
                             suppression (AES) on the residual signal
CancellationOnly             Acoustic echo cancellation only
None                         AEC is turned off
BeamAngleMode abstracts out the underlying DMO SystemMode and MicrophoneArrayMode properties. At the DMO level, it determines whether the DMO should take care of beam forming or allow the application to do this. On top of this, the Kinect for Windows SDK provides an additional set of algorithms for performing beam forming. In general, I prefer to use the Adaptive setting for beam forming, shelling out the complex work to the SDK. Table 7-3 explains what each of the BeamAngleMode settings does.
Table 7-3. BeamAngleMode Enumeration

Beam Angle Mode   What does it do?
Adaptive          Beam forming is controlled by algorithms created for Kinect
Automatic         Beam forming is controlled by the DMO
Manual            Beam forming is controlled by the application
Adaptive beam forming will take advantage of the peculiar characteristics of the Kinect sensor to find the correct sound source, much in the way the skeleton tracker tries to find the correct person to track. Like the skeleton tracker, Kinect's beam forming feature can also be put into manual mode, allowing the application to determine the direction it wants to concentrate on for sound. To use the Kinect sensor as a directional microphone, you will want to set the beam angle mode to Manual and provide a value to the KinectAudioSource's ManualBeamAngle property.
Note: There are some restrictions to the way features can be combined. Automatic gain control should be deactivated if AEC is enabled. Similarly, AGC should be deactivated if speech recognition will be used.
Speech Recognition

Speech recognition is broken down into two different categories: recognition of commands and recognition of free-form dictation. Free-form dictation requires that one train software to recognize a particular voice in order to improve accuracy. This is done by having speakers repeat a series of scripts out loud so the software comes to recognize the speaker's particular vocal patterns.

Command recognition (also called Command and Control) applies another strategy to improve accuracy. Rather than attempt to recognize anything a speaker might say, command recognition constrains the vocabulary that it expects any given speaker to vocalize. Based on a limited set of expectations, command recognition is able to formulate hypotheses about what a speaker is trying to say without having to be familiar with the speaker ahead of time.
Given the nature of Kinect, open-ended dictation does not make sense with the technology. The Kinect SDK is for freestanding applications that anyone can walk up to and begin using. Consequently, the SDK primarily supports command recognition through the Microsoft.Speech library, which is the server version of the Microsoft speech recognition technology. On the other hand, if you really want to, the speech capabilities of the System.Speech library, the desktop version of Microsoft's speech recognition technology built into Windows operating systems, can be referenced and used to build a dictation program using the Kinect microphone. The results of combining Kinect with the System.Speech library for free dictation will not be great, however. This is because the Kinect for Windows Runtime Language Pack (the linguistic models adapted to vocalizations from an open space rather than a source inches from the microphone) cannot be used with System.Speech.
Command recognition with the Microsoft.Speech library is built around the SpeechRecognitionEngine. The SpeechRecognitionEngine class is the workhorse of speech recognition, taking in a processed audio stream from the Kinect sensor and then attempting to parse and interpret vocal utterances as commands it recognizes. The engine weighs the elements of the vocalization and, if it decides that the vocalization contains elements it recognizes, passes it on to an event for processing. If it decides the command is not recognized, it throws that part of the audio stream away.
We tell the SpeechRecognitionEngine what to look for through constructs called grammars. A Grammar object can be made up of single words or strings of words. Grammar objects can include wildcards if there are parts of a phrase whose value we do not care about; for instance, we may not care whether a command includes the phrase "an" apple or "the" apple. A wildcard in our grammar tells the recognition engine that either is acceptable. Additionally, we can add a class called Choices to our grammar. A Choices instance is like a wildcard in that it can contain multiple values. Unlike a wildcard, however, we specify the set of values that will be acceptable in our choices.

For example, if we wanted to recognize the phrase "Give me some fruit," where we do not care what the article before fruit is, but want to be able to replace fruit with additional values such as apple, orange, or banana, we would build a grammar such as the one in Listing 7-1. The Microsoft.Speech library also provides a GrammarBuilder class to help us build our grammars.
Listing 7-1. A Sample Grammar

var choices = new Choices();
choices.Add("fruit");
choices.Add("apple");
choices.Add("orange");
choices.Add("banana");

var grammarBuilder = new GrammarBuilder();
grammarBuilder.Append("give");
grammarBuilder.Append("me");
grammarBuilder.AppendWildcard();
grammarBuilder.Append(choices);

var grammar = new Grammar(grammarBuilder);
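To make the shape of this grammar concrete, here is a small standalone sketch of the acceptance logic it encodes. This is plain C#, not the Microsoft.Speech API, and the FruitGrammar class is purely illustrative: "give", then "me", then any single word for the wildcard, then one of the four choices.

```csharp
using System;
using System.Linq;

static class FruitGrammar
{
    static readonly string[] Choices = { "fruit", "apple", "orange", "banana" };

    // Mirrors the Listing 7-1 grammar: "give" "me" <wildcard> <choice>.
    public static bool Accepts(string utterance)
    {
        var words = utterance.ToLowerInvariant()
                             .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return words.Length == 4
            && words[0] == "give"
            && words[1] == "me"
            // words[2] is the wildcard slot: any single word is acceptable here
            && Choices.Contains(words[3]);
    }
}
```

Both "give me some fruit" and "give me an apple" pass, while a three-word phrase such as "give me fruit" does not, because the wildcard slot still has to be filled.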
Note: Grammars are not case sensitive. It is good practice, however, to be consistent and use either all caps or all lowercase characters in your code.
Grammars are loaded into the speech recognition engine using the engine's LoadGrammar method. The speech recognition engine can, and often does, load multiple grammars. The engine has three events that should be handled: SpeechHypothesized, SpeechRecognized, and SpeechRecognitionRejected.
SpeechHypothesized is what the recognition engine interprets the speaker to be saying before deciding to accept or reject the utterance as a command. SpeechRecognitionRejected is handled in order to do something with failed commands. SpeechRecognized is, by far, the most important event, though. When the speech recognition engine decides that a vocalization is acceptable, it passes it to the event handler for SpeechRecognized with the SpeechRecognizedEventArgs parameter. SpeechRecognizedEventArgs has a Result property described in Table 7-4.
Table 7-4. Speech Result Properties

Result Property Name   What does it do?
Audio                  Provides information about the original audio fragment that is being
                       interpreted as a command.
Confidence             A float value between 0 and 1 indicating the confidence with which the
                       recognition engine accepts the command.
Text                   The command or command phrase as string text.
Words                  A series of RecognizedWordUnits breaking up the command into constituent
                       parts.
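Handlers for SpeechRecognized typically gate on the Confidence value before acting on Text, so that low-quality matches are ignored. A minimal standalone sketch of that gating logic (CommandFilter is a hypothetical helper, and the threshold is an arbitrary application choice, not an SDK value):

```csharp
using System;

static class CommandFilter
{
    // Return the command text only when the engine's confidence clears the
    // caller-chosen threshold; null signals that the command should be ignored.
    public static string Accept(string text, float confidence, float threshold)
    {
        if (confidence < 0f || confidence > 1f)
            throw new ArgumentOutOfRangeException("confidence");
        return confidence >= threshold ? text : null;
    }
}
```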
Instantiating a SpeechRecognitionEngine object for the Kinect requires a very particular set of steps. First, a specific string indicating the ID of the recognizer that will be used with the speech recognition engine must be assigned. When you install the server version of Microsoft's speech libraries, a recognizer called the Microsoft Lightweight Speech Recognizer, with an ID value of SR_MS_ZXX_Lightweight_v10.0, is installed (the ID may be different depending on the version of the speech libraries you install). After installing the Kinect for Windows Runtime Language Pack, a second recognizer, called Microsoft Server Speech Recognition Language - Kinect (en-US), becomes available. It is this second recognizer that we want to use with the Kinect. Next, this string must be used to load the correct recognizer into the SpeechRecognitionEngine. Since the ID of this second recognizer may change in the future, we use pattern matching to find the recognizer we want. Finally, the speech recognition engine must be configured to receive the audio stream coming from the KinectAudioSource object described in the previous section. Fortunately, there is boilerplate code for performing these steps, as illustrated in Listing 7-2.
Listing 7-2. Configuring the SpeechRecognitionEngine Object

var source = new KinectAudioSource();

Func<RecognizerInfo, bool> matchingFunc = r =>
{
    string value;
    r.AdditionalInfo.TryGetValue("Kinect", out value);
    return "True".Equals(value, StringComparison.InvariantCultureIgnoreCase)
        && "en-US".Equals(r.Culture.Name, StringComparison.InvariantCultureIgnoreCase);
};

RecognizerInfo ri = SpeechRecognitionEngine.InstalledRecognizers()
                                           .Where(matchingFunc)
                                           .FirstOrDefault();
var sre = new SpeechRecognitionEngine(ri.Id);

KinectSensor.KinectSensors[0].Start();
Stream s = source.Start();
sre.SetInputToAudioStream(s,
    new SpeechAudioFormatInfo(
        EncodingFormat.Pcm, 16000, 16, 1,
        32000, 2, null));
sre.Recognize();
The second parameter of the SetInputToAudioStream method indicates how the audio from the Kinect is formatted. In the boilerplate code in Listing 7-2, we indicate that the encoding format is Pulse Code Modulation, that we are receiving 16,000 samples per second, that there are 16 bits per sample, that there is one channel, that there are 32,000 average bytes per second, and that the block align value is two. If none of this makes sense to you, do not worry; that's what boilerplate code is for.
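The last two numbers are not independent choices; they follow from the first three. The block align is the size in bytes of one sample frame across all channels, and the average bytes per second is just the sample rate times the block align:

```csharp
static class PcmFormat
{
    // Size in bytes of one sample frame across all channels.
    public static int BlockAlign(int channels, int bitsPerSample)
    {
        return channels * bitsPerSample / 8;
    }

    // Sustained byte rate for uncompressed PCM audio.
    public static int AverageBytesPerSecond(int samplesPerSec, int channels, int bitsPerSample)
    {
        return samplesPerSec * BlockAlign(channels, bitsPerSample);
    }
}
```

For the Kinect stream, BlockAlign(1, 16) yields 2 and AverageBytesPerSecond(16000, 1, 16) yields 32000, matching the values passed to SpeechAudioFormatInfo above.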
Once grammars have been loaded into the speech recognition engine, the engine must be started. There are multiple ways to do this. The recognition engine can be started in either synchronous or asynchronous mode. Additionally, it can be set up to perform recognition only once, or to continue recognizing multiple commands as they are received from the KinectAudioSource. Table 7-5 shows the options for commencing speech recognition.
Table 7-5. Speech Recognition Overloaded Methods

SR Engine Start Syntax                   What does it do?
Recognize()                              Starts the speech recognition engine synchronously and
                                         performs a single recognition operation.
RecognizeAsync(RecognizeMode.Single)     Starts the speech recognition engine asynchronously and
                                         performs a single recognition operation.
RecognizeAsync(RecognizeMode.Multiple)   Starts the speech recognition engine asynchronously and
                                         performs multiple recognition operations.
In the following sections, we will walk through some sample applications in order to illustrate how to use the KinectAudioSource class and the SpeechRecognitionEngine class effectively.
Audio Capture

While the KinectAudioSource class is intended primarily as a conduit for streaming audio data to the SpeechRecognitionEngine, it can in fact also be used for other purposes. A simple alternative use, and one that aids in illustrating the many features of the KinectAudioSource class, is as a source for recording wav files. The following sample project will get you up and running with a primitive audio recorder. You will then be able to use this audio recorder to see how modifying the default values of the various features of the Kinect SDK affects the audio stream that is produced.
Working with the Sound Stream

Even though you will be playing with the Kinect's audio classes in this chapter rather than the visual classes, you begin building projects for Kinect audio in much the same way.

1. Create a new WPF Application project called AudioRecorder.
2. Add a reference to Microsoft.Research.Kinect.dll.
3. Add three buttons to your MainWindow for Play, Record, and Stop.
4. Set the Title property of the main window to "Audio Recorder".
Your screen should look something like Figure 7-1 when your Visual Studio IDE is in design mode.

Figure 7-1. The Recorder window
Frustratingly, there is no native way to write wav files in C#. To aid us in writing such files, we will use the following custom RecorderHelper class. The class needs to use a struct called WAVEFORMATEX, basically a transliteration of a C++ object, in order to facilitate the processing of audio data. We will also add a property to RecorderHelper called IsRecording, allowing us to stop the recording process when we want to. The basic structure of the class, the WAVEFORMATEX struct, and the property are outlined in Listing 7-3. We will also initialize a private byte array called buffer that will be used to chunk the audio stream we receive from Kinect.
Listing 7-3. RecorderHelper.cs

sealed class RecorderHelper
{
    static byte[] buffer = new byte[4096];
    static bool _isRecording;

    public static bool IsRecording
    {
        get { return _isRecording; }
        set { _isRecording = value; }
    }

    struct WAVEFORMATEX
    {
        public ushort wFormatTag;
        public ushort nChannels;
        public uint nSamplesPerSec;
        public uint nAvgBytesPerSec;
        public ushort nBlockAlign;
        public ushort wBitsPerSample;
        public ushort cbSize;
    }

    // ...
}
To complete the helper class, we will add three methods: WriteString, WriteWavHeader, and WriteWavFile. WriteWavFile, seen below in Listing 7-4, takes a KinectAudioSource object, from which we read audio data, and a FileStream object to which we write the data. It begins by writing a fake header to the file, reads through the Kinect audio stream, and chunks it to the FileStream object until it is told to stop by having the _isRecording field set to false. It then checks the size of the stream that has been written to the file and uses that to encode the correct file header.
Listing 7-4. Writing to the Wav File

public static void WriteWavFile(KinectAudioSource source, FileStream fileStream)
{
    var size = 0;

    //write wav header placeholder
    WriteWavHeader(fileStream, size);

    using (var audioStream = source.Start())
    {
        //chunk audio stream to file
        while (audioStream.Read(buffer, 0, buffer.Length) > 0 && _isRecording)
        {
            fileStream.Write(buffer, 0, buffer.Length);
            size += buffer.Length;
        }
    }

    //write real wav header
    long prePosition = fileStream.Position;
    fileStream.Seek(0, SeekOrigin.Begin);
    WriteWavHeader(fileStream, size);
    fileStream.Seek(prePosition, SeekOrigin.Begin);
    fileStream.Flush();
}

public static void WriteWavHeader(Stream stream, int dataLength)
{
    using (MemoryStream memStream = new MemoryStream(64))
    {
        int cbFormat = 18;
        WAVEFORMATEX format = new WAVEFORMATEX()
        {
            wFormatTag = 1,
            nChannels = 1,
            nSamplesPerSec = 16000,
            nAvgBytesPerSec = 32000,
            nBlockAlign = 2,
            wBitsPerSample = 16,
            cbSize = 0
        };

        using (var bw = new BinaryWriter(memStream))
        {
            WriteString(memStream, "RIFF");
            bw.Write(dataLength + cbFormat + 4);
            WriteString(memStream, "WAVE");
            WriteString(memStream, "fmt ");
            bw.Write(cbFormat);
            bw.Write(format.wFormatTag);
            bw.Write(format.nChannels);
            bw.Write(format.nSamplesPerSec);
            bw.Write(format.nAvgBytesPerSec);
            bw.Write(format.nBlockAlign);
            bw.Write(format.wBitsPerSample);
            bw.Write(format.cbSize);
            WriteString(memStream, "data");
            bw.Write(dataLength);
            memStream.WriteTo(stream);
        }
    }
}

static void WriteString(Stream stream, string s)
{
    byte[] bytes = Encoding.ASCII.GetBytes(s);
    stream.Write(bytes, 0, bytes.Length);
}
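Because WriteWavHeader runs twice (once as a placeholder, once with the real size), it helps to know exactly where the fields land. The following standalone sketch (no Kinect required; WavHeaderCheck is an illustrative helper, not part of the project) rebuilds the same header layout in memory. With the 18-byte format block used here, the header is 46 bytes, the "data" marker lands at byte offset 38, and the data length at offset 42.

```csharp
using System;
using System.IO;
using System.Text;

static class WavHeaderCheck
{
    // Builds a header mirroring Listing 7-4's layout (18-byte WAVEFORMATEX,
    // i.e. the two-byte cbSize field included) for a given data length.
    public static byte[] Build(int dataLength)
    {
        const int cbFormat = 18;
        var mem = new MemoryStream(64);
        var bw = new BinaryWriter(mem);
        bw.Write(Encoding.ASCII.GetBytes("RIFF"));
        bw.Write(dataLength + cbFormat + 4);   // RIFF chunk size, as in Listing 7-4
        bw.Write(Encoding.ASCII.GetBytes("WAVE"));
        bw.Write(Encoding.ASCII.GetBytes("fmt "));
        bw.Write(cbFormat);
        bw.Write((ushort)1);                   // wFormatTag: PCM
        bw.Write((ushort)1);                   // nChannels
        bw.Write((uint)16000);                 // nSamplesPerSec
        bw.Write((uint)32000);                 // nAvgBytesPerSec
        bw.Write((ushort)2);                   // nBlockAlign
        bw.Write((ushort)16);                  // wBitsPerSample
        bw.Write((ushort)0);                   // cbSize
        bw.Write(Encoding.ASCII.GetBytes("data"));
        bw.Write(dataLength);
        bw.Flush();
        return mem.ToArray();
    }
}
```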
With the helper written, we can begin setting up and configuring the KinectAudioSource object in MainWindow.cs. We add a private Boolean called _isPlaying to help keep track of whether we are attempting to play back the wav file at any point in time. This helps us to avoid having our record and play functionality occur simultaneously. We also create a private variable for the MediaPlayer object we will use to play back the wav files we record, as well as a _recordingFileName private variable to keep track of the name of the most recently recorded file. In Listing 7-5, we also create several properties to enable and disable buttons when we need to: IsPlaying, IsRecording, IsPlayingEnabled, IsRecordingEnabled, and IsStopEnabled. To make these properties bindable, we make the MainWindow class implement INotifyPropertyChanged, add a PropertyChanged event, and add an OnPropertyChanged helper method.
Listing 7-5. Tracking Recording State

public partial class MainWindow : Window, INotifyPropertyChanged
{
    string _recordingFileName;
    MediaPlayer _mplayer;
    bool _isPlaying;

    public event PropertyChangedEventHandler PropertyChanged;

    private void OnPropertyChanged(string propName)
    {
        if (PropertyChanged != null)
            PropertyChanged(this, new PropertyChangedEventArgs(propName));
    }

    public bool IsPlayingEnabled
    {
        get { return !IsRecording; }
    }

    public bool IsRecordingEnabled
    {
        get { return !IsPlaying && !IsRecording; }
    }

    public bool IsStopEnabled
    {
        get { return IsRecording; }
    }

    private bool IsPlaying
    {
        get { return _isPlaying; }
        set
        {
            if (_isPlaying != value)
            {
                _isPlaying = value;
                OnPropertyChanged("IsRecordingEnabled");
            }
        }
    }

    private bool IsRecording
    {
        get { return RecorderHelper.IsRecording; }
        set
        {
            if (RecorderHelper.IsRecording != value)
            {
                RecorderHelper.IsRecording = value;
                OnPropertyChanged("IsPlayingEnabled");
                OnPropertyChanged("IsRecordingEnabled");
                OnPropertyChanged("IsStopEnabled");
            }
        }
    }

    // ...
The logic of the various properties may seem a bit hairy at first glance. We are setting the IsPlayingEnabled property by checking to see if IsRecording is false, and setting the IsRecordingEnabled property by checking to see if IsPlaying is false. You'll have to trust me that this works when we bind it in the UI as illustrated in Listing 7-6. The XAML for the buttons in the UI should look like this, though you may want to play with the margins in order to line up the buttons properly:
Listing 7-6. Record and Playback Buttons
<StackPanel Orientation="Horizontal">
<Button Content="Play" Click="Play_Click" IsEnabled="{Binding IsPlayingEnabled}"
FontSize="18" Height="44" Width="110" VerticalAlignment="Top" Margin="5"/>
<Button Content="Record" Click="Record_Click" IsEnabled="{Binding IsRecordingEnabled}"
FontSize="18" Height="44" Width="110" VerticalAlignment="Top" Margin="5"/>
<Button Content="Stop" Click="Stop_Click" IsEnabled="{Binding IsStopEnabled}"
FontSize="18" Height="44" Width="110" VerticalAlignment="Top" Margin="5"/>
</StackPanel>
In the MainWindow constructor, illustrated in Listing 7-7, we assign a new MediaPlayer object to the _mplayer variable. Because the media player spins up its own thread internally, we need to capture the moment when it finishes in order to reset all of our button states. Additionally, we use a very old WPF trick to enable our MainWindow to bind to the IsPlayingEnabled and related properties: we set MainWindow's DataContext to itself. This is a shortcut that improves our code's readability, though typically the best practice is to place bindable properties into their own separate classes.
Listing 7-7. Self-Binding Example

public MainWindow()
{
    InitializeComponent();
    this.Loaded += delegate { KinectSensor.KinectSensors[0].Start(); };
    _mplayer = new MediaPlayer();
    _mplayer.MediaEnded += delegate { _mplayer.Close(); IsPlaying = false; };
    this.DataContext = this;
}
We are now ready to instantiate the KinectAudioSource class and pass it to the RecorderHelper class we created earlier, as illustrated in Listing 7-8. As an added precaution, we will make the RecordKinectAudio method thread safe by placing a lock around the body of the method. At the beginning of the lock we set IsRecording to true, and when it ends we set the IsRecording property back to false.
Listing 7-8. Instantiating and Configuring KinectAudioSource

private KinectAudioSource CreateAudioSource()
{
    var source = KinectSensor.KinectSensors[0].AudioSource;
    source.NoiseSuppression = _isNoiseSuppressionOn;
    source.AutomaticGainControlEnabled = _isAutomaticGainOn;

    if (IsAECOn)
    {
        source.EchoCancellationMode = EchoCancellationMode.CancellationOnly;
        source.AutomaticGainControlEnabled = false;
        IsAutomaticGainOn = false;
        source.EchoCancellationSpeakerIndex = 0;
    }
    return source;
}

private object lockObj = new object();

private void RecordKinectAudio()
{
    lock (lockObj)
    {
        IsRecording = true;

        using (var source = CreateAudioSource())
        {
            var time = DateTime.Now.ToString("hhmmss");
            _recordingFileName = time + ".wav";
            using (var fileStream =
                new FileStream(_recordingFileName, FileMode.Create))
            {
                RecorderHelper.WriteWavFile(source, fileStream);
            }
        }

        IsRecording = false;
    }
}
As additional insurance against trying to write to a file before the previous process has finished writing to it, we also create a new wav file name, based on the current time, each time this code runs.
The final step is simply to glue our buttons to the methods for recording and playing back files. The UI buttons call methods such as Play_Click and Record_Click, which each have the proper event handler signatures. These in turn just shell the call to our actual Play and Record methods. You will notice in Listing 7-9 that the Record method spins the RecordKinectAudio method off onto its own thread.
Listing 7-9. Record and Playback Methods

private void Record()
{
    Thread thread = new Thread(new ThreadStart(RecordKinectAudio));
    thread.Priority = ThreadPriority.Highest;
    thread.Start();
}

private void Stop()
{
    IsRecording = false;
    KinectSensor.KinectSensors[0].AudioSource.Stop();
}

private void Play_Click(object sender, RoutedEventArgs e)
{
    Play();
}

private void Record_Click(object sender, RoutedEventArgs e)
{
    Record();
}

private void Stop_Click(object sender, RoutedEventArgs e)
{
    Stop();
}
You can now use Kinect to record audio files. Make sure the Kinect's USB cord is plugged into your computer and that its power cord is plugged into a power source. The Kinect's green LED light should begin to blink steadily. Run the application and press the Record button. Walk around the room to see how well the Kinect sensor is able to pick up your voice from different distances. The microphone array has been configured to use adaptive beam forming in the CreateAudioSource method, so it should follow you around the room as you speak. Press the Stop button to end the recording. When the application has finished writing to the wav file, the Play button will be enabled. Press Play to hear what the Kinect sensor picked up.
Cleaning Up the Sound

We can now extend the AudioRecorder project to test how the feature properties delineated earlier in Table 7-1 affect audio quality. In this section, we will add flags (see Listing 7-10) to turn noise suppression and automatic gain control on and off. Figure 7-2 illustrates the new user interface changes we will implement in order to manipulate sound quality.

Figure 7-2. The Recorder window with feature flags in design view
Listing 7-10. Feature Flags
bool _isNoiseSuppressionOn;
bool _isAutomaticGainOn;
Using the OnPropertyChanged helper we created previously, we can create bindable properties around these fields. Create the properties in Listing 7-11.
Listing 7-11. Feature Properties

public bool IsNoiseSuppressionOn
{
    get { return _isNoiseSuppressionOn; }
    set
    {
        if (_isNoiseSuppressionOn != value)
        {
            _isNoiseSuppressionOn = value;
            OnPropertyChanged("IsNoiseSuppressionOn");
        }
    }
}

public bool IsAutomaticGainOn
{
    get { return _isAutomaticGainOn; }
    set
    {
        if (_isAutomaticGainOn != value)
        {
            _isAutomaticGainOn = value;
            OnPropertyChanged("IsAutomaticGainOn");
        }
    }
}
Next, add checkboxes to the UI in order to toggle these features on and off at will. I spent several hours just trying out different settings, recording a message, playing it back, and then rerecording the message with a different set of feature configurations to see what had changed. Listing 7-12 shows what the XAML should look like. You will want to drag the checkboxes around so they do not just stack on top of each other.
Listing 7-12. Feature Flag CheckBoxes
<CheckBox Content="Noise Suppression" IsChecked="{Binding IsNoiseSuppressionOn}"
Height="16" Width="110" />
<CheckBox Content="Automatic Gain Control" IsChecked="{Binding IsAutomaticGainOn}"
Height="16" Width="110" />
<CheckBox Content="Noise Fill" IsChecked="{Binding IsNoiseFillOn}"
Height="16" Width="110" />
Finally, we can use these flags to change the way the KinectAudioSource is configured in the CreateAudioSource method, shown in Listing 7-13.
Listing 7-13. CreateAudioSource with Feature Flags

private KinectAudioSource CreateAudioSource()
{
    var source = KinectSensor.KinectSensors[0].AudioSource;
    source.BeamAngleMode = BeamAngleMode.Adaptive;

    // set features based on user preferences
    source.NoiseSuppression = _isNoiseSuppressionOn;
    source.AutomaticGainControlEnabled = _isAutomaticGainOn;
    return source;
}
Play with these flags to see how they affect your audio recordings. You will notice that noise suppression has by far the most obvious effect on audio quality. Automatic gain control has a more noticeable effect if you walk around the room as you record and experiment with raising and lowering your voice. The other features are much more subtle. I will leave it to the industrious reader to add additional checkboxes to the UI in order to find out what those features actually do.
Canceling Acoustic Echo

Acoustic echo canceling is not simply a feature of the KinectAudioSource class, but rather something at the core of the Kinect technology. Testing it out is consequently somewhat more complex than playing with the feature flags in the last section.

To test AEC, add another checkbox to the UI and type "AEC" into the checkbox's content attribute. Then create an IsAECOn property modeled on the properties used to set the feature flags. Use a private Boolean field called _isAECOn as the backing field for this property. Finally, bind the checkbox's IsChecked attribute to the IsAECOn property you just created.
As we did above, we will configure AEC in the CreateAudioSource method. It is a bit more involved, however. Just above the line that says "return source," add the code in Listing 7-14. First, the EchoCancellationMode property must be set to enable echo cancelation. Automatic gain control needs to be turned off, since it will not work with AEC. Additionally, we will set the IsAutomaticGainOn property to false, if it is not false already, so the UI shows that there is a conflict. The AEC configuration next requires us to identify the speaker we are using, via EchoCancellationSpeakerIndex, so the AEC algorithms know which outgoing stream to subtract from the incoming stream. You can now test the acoustic echo cancelation capabilities of the Kinect SDK by playing a media file while you record your own voice. A Cee Lo Green song played extra loud did the trick for me.
Listing 7-14. Toggle IsAECOn

if (IsAECOn)
{
    source.EchoCancellationMode = EchoCancellationMode.CancellationOnly;
    source.AutomaticGainControlEnabled = false;
    IsAutomaticGainOn = false;
    source.EchoCancellationSpeakerIndex = 0;
}
Note: In beta versions of the Kinect SDK, AEC used preliminary sampling of the sound from the speaker to determine the length of the echo and how best to eliminate it. This awkwardly required that sound be output through the speakers before AEC was turned on in order for it to work correctly. In V1, this peculiar issue has fortunately been fixed.
Beam Tracking for a Directional Microphone

It's possible to use the four microphones together to simulate the effect of using a directional microphone. The process of doing that is referred to as beam tracking. We will start a new project in order to experiment with beam tracking. Here is what to do:

1. Create a new WPF Application project called FindAudioDirection.
2. Add a reference to Microsoft.Research.Kinect.dll.
3. Set the Title property of the main window to "Find Audio Direction".
4. Draw a thin, vertical rectangle in the root grid of the MainWindow.
The rectangle will be used like a dowsing rod to indicate where the speaker is at any given point in time. The rectangle will have a rotate transform associated with it so we can swivel the object on its axis, as illustrated in Listing 7-15. In my code, I have made the rectangle blue.
Listing 7-15. The Indicator
<Rectangle Fill="#FF1B1BA7" HorizontalAlignment="Left" Margin="240,41,0,39"
Stroke="Black" Width="10" RenderTransformOrigin="0.5,0">
<Rectangle.RenderTransform>
<RotateTransform Angle="{Binding BeamAngle}"/>
</Rectangle.RenderTransform>
</Rectangle>
Figure 7-3 illustrates the rather straightforward user interface for this project. The remaining code is much like the code we used in the AudioRecorder project. We instantiate the KinectAudioSource object in pretty much the same way. The DataContext of MainWindow is set to itself again. We set the BeamAngleMode to Adaptive, since it will track the user automatically.
Figure 7-3. The speech direction indicator in design view
One change is that we need to add an event handler for the Kinect audio source's BeamAngleChanged event, as shown in Listing 7-16. This will fire every time the SDK acknowledges that the user has moved from his previous position. We also need to create a BeamAngle double property so the RotateTransform on our blue rectangle has something to bind to.
Listing 7-16. MainWindow.cs Implementation

public partial class MainWindow : Window, INotifyPropertyChanged
{
    public MainWindow()
    {
        InitializeComponent();
        this.DataContext = this;
        this.Loaded += delegate { ListenForBeamChanges(); };
    }

    private KinectAudioSource CreateAudioSource()
    {
        var source = KinectSensor.KinectSensors[0].AudioSource;
        source.NoiseSuppression = true;
        source.AutomaticGainControlEnabled = true;
        source.BeamAngleMode = BeamAngleMode.Adaptive;
        return source;
    }

    private KinectAudioSource audioSource;

    private void ListenForBeamChanges()
    {
        KinectSensor.KinectSensors[0].Start();
        audioSource = CreateAudioSource();
        audioSource.BeamAngleChanged += audioSource_BeamAngleChanged;
        audioSource.Start();
    }

    public event PropertyChangedEventHandler PropertyChanged;

    private void OnPropertyChanged(string propName)
    {
        if (PropertyChanged != null)
            PropertyChanged(this, new PropertyChangedEventArgs(propName));
    }

    private double _beamAngle;

    public double BeamAngle
    {
        get { return _beamAngle; }
        set
        {
            _beamAngle = value;
            OnPropertyChanged("BeamAngle");
        }
    }

    // ...
The final piece that ties all of this together is the BeamAngleChanged event handler. This will be used to modify the BeamAngle property whenever the beam direction changes. While in some places the SDK uses radians to represent angles, the BeamAngleChanged event conveniently reports the angle in degrees for us. This still does not quite achieve the effect we want, since when the speaker moves to the left, from the Kinect sensor's perspective, our rectangle will appear to swivel in the opposite direction. To account for this, we reverse the sign of the angle, as demonstrated in Listing 7-17.
Listing 7-17. BeamAngleChanged Event Handler

void audioSource_BeamAngleChanged(object sender, BeamAngleChangedEventArgs e)
{
    BeamAngle = e.Angle * -1;
}
As you play with this project, try to talk constantly while walking around the room to see how quickly the adaptive beam can track you. Keep in mind that it can only track you if you are talking. A nearby television, I have discovered, can also fool the adaptive beam regarding your location.
Speech Recognition

In this section, we will finally combine the power of the KinectAudioSource with the cleverness of the SpeechRecognitionEngine. To illustrate how speech commands can be used effectively with skeleton tracking, we will attempt to replicate Chris Schmandt's pioneering work on NUI interfaces from 1979. You can find a video of his project, "Put That There," on YouTube. The original script of the "Put That There" demonstration went something like this:
Create a yellow circle, there.
Create a cyan triangle, there.
Put a magenta square, there.
Create a blue diamond, there.
Move that... there.
Put that... there.
Move that... below that.
Move that... west of the diamond.
Put a large green circle... there.
We will not be able to replicate everything in that short video in the remaining pages of this chapter, but we will at least reproduce the aspect of it where Chris is able to create an object on the screen using hand manipulations and audible commands. Figure 7-4 shows what the version of Put That There that we are about to construct looks like.
Figure 7-4. Put That There UI
Use the following guidance to start the Put That There application. In these steps, we will create the basic project, make it Kinect enabled, and then add a user control to provide an affordance for hand tracking.

1. Create a new WPF Application project called PutThatThere.

2. Add a reference to Microsoft.Kinect.dll.

3. Add a reference to Microsoft.Speech. The Microsoft.Speech assembly can be found in C:\Program Files (x86)\Microsoft Speech Platform SDK\Assembly.

4. Set the Title property of the main window to "Put That There".

5. Create a new UserControl called CrossHairs.xaml.

The CrossHairs user control is simply a drawing of crosshairs that we can use to track the movements of the user's right hand, just like in Chris Schmandt's video. It has no behaviors. Listing 7-18 shows what the XAML should look like. You will notice that we offset the crosshairs from the container in order to have our two rectangles cross at the zero, zero grid position.
Listing 7-18. Crosshairs
<Grid Height="50" Width="50" RenderTransformOrigin="0.5,0.5">
<Grid.RenderTransform>
<TranslateTransform X="-25" Y="-25"/>
</Grid.RenderTransform>
<Rectangle Fill="#FFF4F4F5" Margin="20,0,20,0" Stroke="#FFF4F4F5"/>
<Rectangle Fill="#FFF4F4F5" Margin="0,20,0,20" Stroke="#FFF4F4F5"/>
</Grid>
In the MainWindow, change the root grid to a canvas. A canvas will make it easier for us to animate the crosshairs to match the movements of the user's hand. Drop an instance of the CrossHairs control into the canvas. You will notice in Listing 7-19 that we also nest the root canvas panel inside a Viewbox control. This is an old trick to handle different resolution screens. The viewbox will automatically resize its contents to match the screen real estate available. Set the background of the MainWindow as well as the background of the canvas to the color black. We will also add two labels to the bottom of the canvas. One will display hypothesized text as the SpeechRecognitionEngine attempts to interpret it while the other will display the confidence with which the engine rates the commands it hears. The position of the CrossHairs control will be bound to two properties, HandTop and HandLeft. The content attributes of the two labels will be bound to HypothesizedText and Confidence, respectively. If you are not overly familiar with XAML syntax, you can just paste the code in Listing 7-19 as you see it. We are done with XAML for now.
Listing 7-19. MainWindow XAML
xmlns:local="clr-namespace:PutThatThere"
Title="Put That There" Background="Black">
<Viewbox>
<Canvas x:Name="MainStage" Height="1080" Width="1920" Background="Black"
VerticalAlignment="Bottom">
<local:CrossHairs Canvas.Top="{Binding HandTop}" Canvas.Left="{Binding HandLeft}" />
<Label Foreground="White" Content="{Binding HypothesizedText}" Height="55" Width="965"
FontSize="32" Canvas.Left="115" Canvas.Top="1025" />
<Label Foreground="Green" Content="{Binding Confidence}" Height="55" Width="114"
FontSize="32" Canvas.Left="0" Canvas.Top="1025" />
</Canvas>
</Viewbox>
In MainWindow.cs we will, as in previous projects in this chapter, make MainWindow implement INotifyPropertyChanged and add an OnPropertyChanged helper method. Consult Listing 7-20 if you run into any problems. We will also create the four properties that our UI needs to bind to.
Listing 7-20. MainWindow.cs Implementation
public partial class MainWindow : Window, INotifyPropertyChanged
{
public event PropertyChangedEventHandler PropertyChanged;
private void OnPropertyChanged(string propertyName)
{
if (PropertyChanged != null)
{
PropertyChanged(this, new PropertyChangedEventArgs(propertyName));
}
}
private double _handLeft;
public double HandLeft
{
get { return _handLeft; }
set
{
_handLeft = value;
OnPropertyChanged("HandLeft");
}
}
private double _handTop;
public double HandTop
{
get { return _handTop; }
set
{
_handTop = value;
OnPropertyChanged("HandTop");
}
}
private string _hypothesizedText;
public string HypothesizedText
{
get { return _hypothesizedText; }
set
{
_hypothesizedText = value;
OnPropertyChanged("HypothesizedText");
}
}
private string _confidence;
public string Confidence
{
get { return _confidence; }
set
{
_confidence = value;
OnPropertyChanged("Confidence");
}
}
// ...
Add the CreateAudioSource method as shown in Listing 7-21. For CreateAudioSource, be aware that AutomaticGainControlEnabled cannot be set to true as this interferes with speech recognition. It is set to false by default.

Listing 7-21. CreateAudioSource
private KinectAudioSource CreateAudioSource()
{
var source = KinectSensor.KinectSensors[0].AudioSource;
source.AutomaticGainControlEnabled = false;
source.EchoCancellationMode = EchoCancellationMode.None;
return source;
}
This takes care of the basics. We next need to set up skeleton tracking in order to track the right hand. Create a field-level variable for the current KinectSensor instance called _kinectSensor, as shown in Listing 7-22. Also declare a constant string to specify the recognizer identifier used for speech recognition with Kinect. We will start both the NUI runtime as well as the SpeechRecognitionEngine in the constructor for MainWindow. Additionally, we will create handlers for the NUI runtime skeleton events and set MainWindow's data context to itself.

Listing 7-22. Initialize the KinectSensor and the SpeechRecognitionEngine
KinectSensor _kinectSensor;
SpeechRecognitionEngine _sre;
KinectAudioSource _source;
public MainWindow()
{
InitializeComponent();
this.DataContext = this;
this.Unloaded += delegate
{
_kinectSensor.SkeletonStream.Disable();
_sre.RecognizeAsyncCancel();
_sre.RecognizeAsyncStop();
};
this.Loaded += delegate
{
_kinectSensor = KinectSensor.KinectSensors[0];
_kinectSensor.SkeletonStream.Enable(new TransformSmoothParameters());
_kinectSensor.SkeletonFrameReady += nui_SkeletonFrameReady;
_kinectSensor.Start();
StartSpeechRecognition();
};
}
In the code above, we pass a new TransformSmoothParameters object to the skeleton stream's Enable method in order to remove some of the shakiness that can accompany hand tracking. The nui_SkeletonFrameReady event handler shown below in Listing 7-23 uses skeleton tracking data to find the location of just the joint we are interested in: the right hand. You should already be familiar with other versions of this code from prior chapters. Basically, we iterate through any skeletons the skeleton tracker is currently following. We pull out the vector information for the right hand joint. We then extract the current relative X and Y coordinates of the right hand using MapSkeletonPointToDepth, and massage these coordinates to match the size of our screen.
Listing 7-23. Hand Tracking
void nui_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
{
if (skeletonFrame == null)
return;
var skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];
skeletonFrame.CopySkeletonDataTo(skeletons);
foreach (Skeleton skeletonData in skeletons)
{
if (skeletonData.TrackingState == SkeletonTrackingState.Tracked)
{
Microsoft.Kinect.SkeletonPoint rightHandVec =
skeletonData.Joints[JointType.HandRight].Position;
var depthPoint = _kinectSensor.MapSkeletonPointToDepth(rightHandVec
, DepthImageFormat.Resolution640x480Fps30);
HandTop = depthPoint.Y * this.MainStage.ActualHeight/480;
HandLeft = depthPoint.X * this.MainStage.ActualWidth/640;
}
}
}
}
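To see exactly what the scaling at the end of the handler does, here is the same arithmetic factored into a stand-alone helper (a sketch for illustration only; ScaleToCanvas is our own name, not part of the SDK):

```csharp
// Hypothetical helper mirroring the coordinate massage in Listing 7-23:
// a depth-space coordinate (640 wide by 480 tall) is scaled to whatever
// size the MainStage canvas actually is.
public static class HandScaling
{
    public static double ScaleToCanvas(double depthCoordinate,
                                       double depthRange,
                                       double canvasRange)
    {
        // Proportional mapping: depthCoordinate / depthRange ==
        // result / canvasRange.
        return depthCoordinate * canvasRange / depthRange;
    }
}
```

A hand at depth pixel (320, 240) on the 1920 by 1080 canvas from Listing 7-19 therefore lands at (960, 540), the center of the stage.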
This is all that is required to have hand tracking in our application. Whenever we set the HandTop and HandLeft properties, the UI is updated and the crosshairs change position.

We have yet to set up speech recognition. The StartSpeechRecognition method must find the correct recognizer for Kinect speech recognition and apply it to the SpeechRecognitionEngine. Listing 7-24 demonstrates how this occurs and also how we connect things up so the KinectAudioSource passes its data to the recognition engine. We add handlers for the SpeechRecognized event, the SpeechHypothesized event, and the SpeechRejected event. The specific values for SetInputToAudioStream are simply boilerplate and not really something to worry about. Please note that while the SpeechRecognitionEngine and the KinectAudioSource are disposable types, we actually need to keep them open for the lifetime of the application.
Listing 7-24. StartSpeechRecognition Method
private void StartSpeechRecognition()
{
_source = CreateAudioSource();
Func<RecognizerInfo, bool> matchingFunc = r =>
{
string value;
r.AdditionalInfo.TryGetValue("Kinect", out value);
return "True".Equals(value, StringComparison.InvariantCultureIgnoreCase)
&& "en-US".Equals(r.Culture.Name, StringComparison.InvariantCultureIgnoreCase);
};
RecognizerInfo ri = SpeechRecognitionEngine.InstalledRecognizers()
.Where(matchingFunc).FirstOrDefault();
_sre = new SpeechRecognitionEngine(ri.Id);
CreateGrammars(ri);
_sre.SpeechRecognized += sre_SpeechRecognized;
_sre.SpeechHypothesized += sre_SpeechHypothesized;
_sre.SpeechRecognitionRejected += sre_SpeechRecognitionRejected;
    Stream s = _source.Start();
    _sre.SetInputToAudioStream(s,
        new SpeechAudioFormatInfo(
            EncodingFormat.Pcm, 16000, 16, 1,
            32000, 2, null));
    _sre.RecognizeAsync(RecognizeMode.Multiple);
}
To finish Put That There, we still need to fill in the recognition event handlers and also put in grammar logic so the recognition engine knows how to process our commands. The rejected and hypothesized event handlers in Listing 7-25 for the most part just update the labels in our presentation layer and are fairly straightforward. The sre_SpeechRecognized event handler is slightly more complicated in that it is responsible for taking the commands passed to it and figuring out what to do. Additionally, since part of its task is to create GUI objects (in the code below this is shelled out to the InterpretCommand method), we must use the Dispatcher in order to run InterpretCommand back on the main GUI thread.
Listing 7-25. Speech Event Handlers
void sre_SpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e)
{
HypothesizedText += " Rejected";
Confidence = Math.Round(e.Result.Confidence, 2).ToString();
}
void sre_SpeechHypothesized(object sender, SpeechHypothesizedEventArgs e)
{
HypothesizedText = e.Result.Text;
}
void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
Dispatcher.BeginInvoke(new Action<SpeechRecognizedEventArgs>(InterpretCommand),e);
}
We now come to the meat of the application: creating our grammar and interpreting it. The purpose of Put That There is to recognize a general phrase that begins with either "put" or "create." This is followed by an article, which we do not care about. The next word in the command should be a color, followed by a shape. The last word in the command phrase should be "there." Listing 7-26 shows how we take these rules and create a grammar for it.

First, we create any Choices objects that will need to be used in our command phrase. For Put That There, we need colors and shapes. Additionally, the first word can be either "put" or "create," so we create a Choices object for them also. We then string our command terms together using the GrammarBuilder class. First, "put" or "create," then a wildcard because we do not care about the article, then the colors Choices object, then the shapes Choices object, and finally the single word "there."

We load this grammar into the speech recognition engine. As pointed out above, we also need a way to stop the recognition engine. We create a second grammar with just one command, "Quit," and load this into the speech recognition engine also.
Listing 7-26. Building a Complex Grammar
private void CreateGrammars(RecognizerInfo ri)
{
var colors = new Choices();
colors.Add("cyan");
colors.Add("yellow");
colors.Add("magenta");
colors.Add("blue");
colors.Add("green");
colors.Add("red");
var create = new Choices();
create.Add("create");
create.Add("put");
var shapes = new Choices();
shapes.Add("circle");
shapes.Add("triangle");
shapes.Add("square");
shapes.Add("diamond");
var gb = new GrammarBuilder();
gb.Culture = ri.Culture;
gb.Append(create);
gb.AppendWildcard();
gb.Append(colors);
gb.Append(shapes);
gb.Append("there");
var g = new Grammar(gb);
_sre.LoadGrammar(g);
var q = new GrammarBuilder();
q.Append("quit application");
var quit = new Grammar(q);
_sre.LoadGrammar(quit);
}
Once the speech recognition engine determines that it recognizes a phrase, the real work begins. The recognized phrase must be parsed and we must decide what to do with it. The code shown in Listing 7-27 reads through the sequence of word objects passed to it and begins building graphic objects to be placed on the screen.

We first check to see if the command "Quit" was passed to the InterpretCommand method. If it was, there is no reason to continue. Generally, however, we are being passed more complex command phrases.

The InterpretCommand method checks the first word to verify that either "create" or "put" was uttered. If for some reason something else is the first word, we throw out the command. If the first word of the phrase is correct, we go on to the third word and create a color object based on the term we receive. If the third word is not recognized, the process ends. Otherwise, we proceed to the fourth word and build a shape based on the command word we receive. At this point, we are done with the interpretation process since we needed the fifth word just to make sure the phrase was said correctly. The X and Y coordinates of the current hand position are retrieved and a specified shape of a specified hue is created at that location on the root panel of MainWindow.
Listing 7-27. Interpreting Commands
private void InterpretCommand(SpeechRecognizedEventArgs e)
{
var result = e.Result;
Confidence = Math.Round(result.Confidence,2).ToString();
if (result.Words[0].Text == "quit")
{
_isQuit = true;
return;
}
if (result.Words[0].Text == "put" || result.Words[0].Text == "create")
{
var colorString = result.Words[2].Text;
Color color;
switch (colorString)
{
case "cyan": color = Colors.Cyan;
break;
case "yellow": color = Colors.Yellow;
break;
case "magenta": color = Colors.Magenta;
break;
case "blue": color = Colors.Blue;
break;
case "green": color = Colors.Green;
break;
case "red": color = Colors.Red;
break;
default:
return;
}
var shapeString = result.Words[3].Text;
Shape shape;
switch (shapeString)
{
case "circle":
shape = new Ellipse();
shape.Width = 150;
shape.Height = 150;
break;
case "square":
shape = new Rectangle();
shape.Width = 150;
shape.Height = 150;
break;
case "triangle":
var poly = new Polygon();
poly.Points.Add(new Point(0, 0));
poly.Points.Add(new Point(150, 0));
poly.Points.Add(new Point(75, -150));
shape = poly;
break;
case "diamond":
var poly2 = new Polygon();
poly2.Points.Add(new Point(0, 0));
poly2.Points.Add(new Point(75, 150));
poly2.Points.Add(new Point(150, 0));
poly2.Points.Add(new Point(75, -150));
shape = poly2;
break;
default:
return;
}
shape.SetValue(Canvas.LeftProperty, HandLeft);
shape.SetValue(Canvas.TopProperty, HandTop);
shape.Fill = new SolidColorBrush(color);
MainStage.Children.Add(shape);
}
}
In a strange way, this project gets us back to the origins of NUI design. It is a concept that was devised even before mouse devices had become widespread and inaugurated the GUI revolution of the 90s. Both gestural and speech metaphors have been around for a long time both in the movies and in the laboratory. It is a relief to finally get them into people's homes and offices nearly thirty years after Chris Schmandt filmed Put That There. And with luck, there it will stay.
Summary
As with the skeleton tracking capabilities of Kinect, the audio capabilities of Kinect provide powerful tools not previously available to independent developers. In this chapter, you delved into the often overlooked audio capabilities of the Kinect sensor. You learned how to manipulate the advanced properties of the KinectAudioSource. You also learned how to install and use the appropriate speech recognition libraries to create an audio pipeline that recognizes speech commands.

You built several audio applications demonstrating how to use Kinect's built-in microphone array. You configured the Kinect to implement beamforming and tracked a user walking around the room based only on his voice. You also combined the speech recognition power of Kinect with its skeleton tracking capabilities to create a complex multi-modal interface. Using these applications as a starting point, you will be able to integrate speech commands into your Kinect-based applications to create novel user experiences.
CHAPTER 8
Beyond the Basics
In the preface to his book The Order of Things, the philosopher Michel Foucault credits Jorge Luis Borges for inspiring his research with a passage about "a certain encyclopaedia" in which it is written that "animals are divided into: (a) belonging to the Emperor, (b) embalmed, (c) tame, (d) suckling pigs, (e) sirens, (f) fabulous, (g) stray dogs, (h) included in the present classification, (i) frenzied, (j) innumerable, (k) drawn with a very fine camelhair brush, (l) et cetera, (m) having just broken the water pitcher, (n) that from a long way off look like flies." This supposed Chinese Encyclopedia cited by Borges was called the Celestial Emporium of Benevolent Knowledge.
After doing our best to break down the various aspects of the Kinect SDK into reasonably classified chunks of benevolent knowledge in the previous seven chapters, the authors of the present volume have finally reached the et cetera chapter where we try to cover a hodgepodge of things remaining about the Kinect SDK that have not yet been addressed thematically. In a different sort of book this chapter might have been entitled Sauces and Pickles. Were we more honest, we would simply call it et cetera (or possibly even things that from a long way off look like flies). Following the established tradition of technical books, however, we have chosen to call it Beyond the Basics.
The reader will have noticed that after learning to use the video stream, depth camera, skeleton tracking, microphone array, and speech recognition in the prior chapters, she is still a distance away from being able to produce the sorts of Kinect experiences seen on YouTube. The Kinect SDK provides just about everything the other available Kinect libraries offer and in certain cases much more. In order to take Kinect for PC programming to the next level, however, it is necessary to apply complex mathematics as well as combine Kinect with additional libraries not directly related to Kinect programming. The true potential of Kinect is actualized only when it is combined with other technologies into a sort of mashup.

In this chapter, you explore some additional software libraries available to help you work with and manipulate the data provided by the Kinect sensor. A bit like duct taping pipes together, you create mashups of different technologies to see what you can really do with Kinect. On the other hand, when you leave the safety of an SDK designed for one purpose, code can start to get messy. The purpose of this chapter is not to provide you with ready-made code for your projects but rather simply to provide a taste of what might be possible and offer some guidance on how to achieve it. The greatest difficulty with programming for Kinect is generally not any sort of technical limitation, but rather a lack of knowledge about what is available to be worked with. Once you start to understand what is available, the possible applications of Kinect technology may even seem overwhelming.
In an effort to provide benevolent knowledge about these various additional libraries, I run the risk of covering certain libraries that will not last and of failing to duly cover other libraries that will turn out to be much more important for Kinect hackers than they currently are. The world of Kinect development is progressing so rapidly that this danger can hardly be avoided. However, by discussing helper libraries, image processing libraries, et cetera, I hope at least to indicate what sorts of software are valuable and interesting to the Kinect developer. Over the next year, should alternative libraries turn out to be better than the ones I write about here, it is my hope that the current discussion, while not attending to them directly, will at least point the way to those better third-party libraries.
In this chapter I will discuss several tools you might find helpful including the Coding4Fun Kinect Toolkit, Emgu (the C# wrapper for a computer vision library called OpenCV), and Blender. I will very briefly touch on a 3D gaming framework called Unity, the gesture middleware FAAST, and the Microsoft Robotics Developer Studio. Each of these rich tools deserves more than a mere mention but each, unfortunately, is outside the scope of this chapter.
The structure of Beyond the Basics is pragmatic. Together we will walk through how to build helper libraries and proximity and motion detectors. Then we'll move into face detection applications. Finally we'll build some simulated holograms. Along the way, you will pick up skills and knowledge about tools for image manipulation that will serve as the building blocks for building even more sophisticated applications on your own.
Image Manipulation Helper Methods
There are many different kinds of images and many libraries available to work with them. In the .NET Framework alone, both a System.Windows.Media.Drawing abstract class as well as a System.Drawing namespace are provided, belonging to PresentationCore.dll and System.Drawing.dll respectively. To complicate things a little more, both the System.Windows and System.Drawing namespaces contain classes related to shapes and colors that are independent of one another. Sometimes methods in one library allow for image manipulations not available in the other. To take advantage of them, it may be necessary to convert images of one type to images of another and then back again.
When we throw Kinect into the mix, things get exponentially more complex. Kinect has its own image types like the ImageFrame. In order to make types like ImageFrame work with WPF, the ImageFrame must be converted into an ImageSource type, which is part of the System.Windows.Media.Imaging namespace. Third-party image manipulation libraries like Emgu do not know anything about the System.Windows.Media namespace, but do have knowledge of the System.Drawing namespace. In order to work with Kinect and Emgu, then, it is necessary to convert Microsoft.Kinect types to System.Drawing types, convert System.Drawing types to Emgu types, convert Emgu types back to System.Drawing types after some manipulations, and then finally convert these back to System.Windows.Media types so WPF can consume them.
The Coding4Fun Kinect Toolkit
Clint Rutkas, Dan Fernandez, and Brian Peek have put together a library called the Coding4Fun Kinect Toolkit that provides some of the conversions necessary for translating class types from one library to another. The toolkit can be downloaded at http://c4fkinect.codeplex.com. It contains three separate dlls. The Coding4Fun.Kinect.Wpf library, among other things, provides a set of extension methods for working between Microsoft.Kinect types and System.Windows.Media types. The Coding4Fun.Kinect.WinForm library provides extension methods for transforming Microsoft.Kinect types into System.Drawing types. System.Drawing is the underlying .NET graphics library primarily for WinForms development just as System.Windows.Media contains types for WPF.
The unfortunate thing about the Coding4Fun Kinect Toolkit is that it does not provide ways to convert types between those in the System.Drawing namespace and those in the System.Windows.Media namespace. This is because the goal of the Toolkit in the first iteration appears to be to provide ways to simplify writing Kinect demo code for distribution rather than to provide a general-purpose library for working with image types. Consequently, some methods one might need for WPF programming are contained in a dll called WinForms. Moreover, useful methods for working with the very complex depth image data from the depth stream are locked away inside a method that simply transforms a Kinect image type to a WPF ImageSource object.
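Bridging that gap is not hard to do yourself. The sketch below shows one common way: serialize the System.Drawing.Bitmap through an in-memory PNG and read it back as a WPF image. The ToBitmapSource name here is our own, not something the toolkit provides.

```csharp
using System.IO;
using Media = System.Windows.Media;

public static class BridgeExtensions
{
    // Converts a GDI+ Bitmap to a WPF BitmapSource by round-tripping
    // the pixels through an in-memory PNG. Slower than a raw pixel copy,
    // but simple and safe across pixel formats.
    public static Media.Imaging.BitmapSource ToBitmapSource(
        this System.Drawing.Bitmap bitmap)
    {
        using (var stream = new MemoryStream())
        {
            bitmap.Save(stream, System.Drawing.Imaging.ImageFormat.Png);
            stream.Position = 0;
            var image = new Media.Imaging.BitmapImage();
            image.BeginInit();
            // Load fully now so the stream can be disposed afterwards.
            image.CacheOption = Media.Imaging.BitmapCacheOption.OnLoad;
            image.StreamSource = stream;
            image.EndInit();
            return image;
        }
    }
}
```

With this in place, wiring a Bitmap into a WPF Image control is a one-liner: image1.Source = myBitmap.ToBitmapSource();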
There are two great things about the Coding4Fun Kinect Toolkit that make negligible any quibbling criticisms I might have concerning it. First, the source code is browsable, allowing us to study the techniques used by the Coding4Fun team to work with the byte arrays that underlie the image manipulations they perform. While you have seen a lot of similar code in the previous chapters, it is extremely helpful to see the small tweaks used to compose these techniques into simple one-call methods. Second, the Coding4Fun team brilliantly decided to structure these methods as extension methods.
In case you are unfamiliar with extension methods, they are simply syntactic sugar that allows a stand-alone method to look like it has been attached to a preexisting class type. For instance, you might have a C# method called AddOne that adds one to any integer. This method can be turned into an extension method hanging off the Integer type simply by making it a static method, placing it in a top-level static class, and adding the keyword this to the first parameter of the method, as shown in Listing 8-1. Once this is done, instead of calling AddOne(3) to get the value four, we can instead call 3.AddOne().
Listing 8-1. Turning Normal Methods Into Extension Methods
public int AddOne(int i)
{
return i + 1;
}
// becomes the extension method
public static class MyExtensions
{
public static int AddOne(this int i)
{
return i + 1;
}
}
To use an extension method library, all you have to do is include the namespace associated with the methods in the namespace declaration of your own code. The name of the static class that contains the extensions (MyExtensions in the case above) is actually ignored. When extension methods are used to transform image types from one library to image types from another, they simplify work with images by letting us perform operations like:
var bitmapSource = imageFrame.ToBitmapSource();
image1.Source = bitmapSource;
Table 8-1 outlines the extension methods provided by version 1.0 of the Coding4Fun Kinect Toolkit. You should use them as a starting point for developing applications with the Kinect SDK. As you build up experience, however, you should consider building your own library of helper methods. In part, this will aid you as you discover that you need helpers the Coding4Fun libraries do not provide. More important, because the Coding4Fun methods hide some of the complexity involved in working with depth image data, you may find that they do not always do what you expect them to do. While hiding complexity is admittedly one of the main purposes of helper methods, you will likely feel confused to find that, when working with the depth stream and the Coding4Fun Toolkit, e.ImageFrame.ToBitmapSource() returns something substantially different from e.ImageFrame.Image.Bits.ToBitmapSource(e.ImageFrame.Image.Width, e.ImageFrame.Image.Height). Building your own extension methods for working with images will help simplify developing with Kinect while also allowing you to remain aware of what you are actually doing with the data streams coming from the Kinect sensor.
Table 8-1. Coding4Fun Kinect Toolkit 1.0 Extension Methods

Library                    Method Name     Extended Type          Output
Coding4Fun.Kinect.Wpf      GetMidpoint     short[]                System.Windows.Point
Coding4Fun.Kinect.Wpf      Save            BitmapSource           void
Coding4Fun.Kinect.Wpf      ToBitmapSource  byte[]                 BitmapSource
Coding4Fun.Kinect.Wpf      ToBitmapSource  DepthImageFrame        BitmapSource
Coding4Fun.Kinect.Wpf      ToBitmapSource  ColorImageFrame        BitmapSource
Coding4Fun.Kinect.Wpf      ToBitmapSource  short[]                BitmapSource
Coding4Fun.Kinect.Wpf      ToDepthArray    DepthImageFrame        short[]
Coding4Fun.Kinect.WinForm  GetMidpoint     short[]                System.Windows.Point
Coding4Fun.Kinect.WinForm  Save            System.Drawing.Bitmap  void
Coding4Fun.Kinect.WinForm  ScaleTo         Joint                  Joint
Coding4Fun.Kinect.WinForm  ToBitmap        byte[]                 System.Drawing.Bitmap
Coding4Fun.Kinect.WinForm  ToBitmap        DepthImageFrame        System.Drawing.Bitmap
Coding4Fun.Kinect.WinForm  ToBitmap        ColorImageFrame        System.Drawing.Bitmap
Coding4Fun.Kinect.WinForm  ToBitmap        short[]                System.Drawing.Bitmap
Coding4Fun.Kinect.WinForm  ToDepthArray    DepthImageFrame        short[]
Your Own Extension Methods
We can build our own extension methods. In this chapter I walk you through the process of building a set of extension methods that will be used for the image manipulation projects. The chief purpose of these methods is to allow us to convert images freely between types from the System.Drawing namespace, which are more commonly used, and types in the System.Windows.Media namespace, which tend to be specific to WPF programming. This in turn provides a bridge between third-party libraries (and even found code on the Internet) and the WPF platform. These implementations are simply standard implementations for working with Bitmap and BitmapSource objects. Some of them are also found in the Coding4Fun Kinect Toolkit. If you do not feel inclined to walk through this code, you can simply copy the implementation from the sample code associated with this chapter and skip ahead.
Instead of creating a separate library for our extension methods, we will simply create a class that can be copied from project to project. The advantage of this is that all the methods are well exposed and can be inspected if code you expect to work one way ends up working in an entirely different way (a common occurrence with image processing).
Create a WPF Project
Now we are ready to create a new sample WPF project in which we can construct and test the extension methods class. We will build a MainWindow.xaml page similar to the one in Listing 8-2 with two images, one called rgbImage and one called depthImage.
Listing 8-2. Extension Methods Sample XAML Page
<Window x:Class="ImageLibrarySamples.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
Title="Image Library Samples" >
<Grid>
<Grid.ColumnDefinitions>
<ColumnDefinition/>
<ColumnDefinition/>
</Grid.ColumnDefinitions>
<Image Name="rgbImage" Stretch="Uniform" Grid.Column="0"/>
<Image Name="depthImage" Stretch="Uniform" Grid.Column="1"/>
</Grid>
</Window>
This process should feel second nature to you by now. For the code-behind, add a reference to Microsoft.Kinect.dll. Declare a Microsoft.Kinect.KinectSensor member and instantiate it in the MainWindow constructor, as shown in Listing 8-3 (and as you have already done a dozen times if you have been working through the projects in this book). Initialize the KinectSensor object, handle the ColorFrameReady and DepthFrameReady events, and then open the video and depth streams, the latter without player data.
Listing 8-3. Extension Methods Sample MainWindow Code-Behind
Microsoft.Kinect.KinectSensor _kinectSensor;
public MainWindow()
{
InitializeComponent();
this.Unloaded += delegate
{
_kinectSensor.ColorStream.Disable();
_kinectSensor.DepthStream.Disable();
};
this.Loaded += delegate
{
    _kinectSensor = KinectSensor.KinectSensors[0];
    _kinectSensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
    _kinectSensor.DepthStream.Enable(DepthImageFormat.Resolution320x240Fps30);
    _kinectSensor.ColorFrameReady += ColorFrameReady;
    _kinectSensor.DepthFrameReady += DepthFrameReady;
    _kinectSensor.Start();
};
}
void DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
}
void ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
}
Create a Class and Some Extension Methods
Add a new class to the project called ImageExtensions.cs to contain the extension methods. Remember that while the actual name of the class is unimportant, the namespace does get used. In Listing 8-4, I use the namespace ImageManipulationExtensionMethods. Also, you will need to add a reference to System.Drawing.dll. As mentioned previously, both the System.Drawing namespace and the System.Windows.Media namespace share similarly named objects. In order to prevent namespace collisions, for instance with the PixelFormat classes, we must select one of them to be primary in our namespace declarations. In the code below, I use System.Drawing as the default namespace and create an alias for the System.Windows.Media namespace abbreviated to Media. Finally, create extension methods for the two most important image transformations: for turning a byte array into a Bitmap object and for turning a byte array into a BitmapSource object. These two extensions will be used on the bytes of a color image. Create two more extension methods for transforming depth images by replacing the byte arrays in these method signatures with short arrays since depth images come across as arrays of the short type rather than bytes.
Listing 8-4. Image Manipulation Extension Methods

using System;
using System.Drawing;
using Microsoft.Kinect;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using System.Windows;
using System.IO;
using Media = System.Windows.Media;

namespace ImageManipulationExtensionMethods
{
    public static class ImageExtensions
    {
        public static Bitmap ToBitmap(this byte[] data, int width, int height
            , PixelFormat format)
        {
            var bitmap = new Bitmap(width, height, format);
            var bitmapData = bitmap.LockBits(
                new System.Drawing.Rectangle(0, 0, bitmap.Width, bitmap.Height),
                ImageLockMode.WriteOnly,
                bitmap.PixelFormat);
            Marshal.Copy(data, 0, bitmapData.Scan0, data.Length);
            bitmap.UnlockBits(bitmapData);
            return bitmap;
        }

        public static Bitmap ToBitmap(this short[] data, int width, int height
            , PixelFormat format)
        {
            var bitmap = new Bitmap(width, height, format);
            var bitmapData = bitmap.LockBits(
                new System.Drawing.Rectangle(0, 0, bitmap.Width, bitmap.Height),
                ImageLockMode.WriteOnly,
                bitmap.PixelFormat);
            Marshal.Copy(data, 0, bitmapData.Scan0, data.Length);
            bitmap.UnlockBits(bitmapData);
            return bitmap;
        }

        public static Media.Imaging.BitmapSource ToBitmapSource(this byte[] data
            , Media.PixelFormat format, int width, int height)
        {
            return Media.Imaging.BitmapSource.Create(width, height, 96, 96
                , format, null, data, width * format.BitsPerPixel / 8);
        }

        public static Media.Imaging.BitmapSource ToBitmapSource(this short[] data
            , Media.PixelFormat format, int width, int height)
        {
            return Media.Imaging.BitmapSource.Create(width, height, 96, 96
                , format, null, data, width * format.BitsPerPixel / 8);
        }
    }
}
The implementations above are somewhat arcane and not necessarily worth going into here. What is important is that, based on these two methods, we can get creative and write additional helper extension methods that decrease the number of parameters that need to be passed.
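One detail worth pausing over is the last argument to BitmapSource.Create in Listing 8-4: width * format.BitsPerPixel / 8 is the stride, the number of bytes in one row of pixel data. A quick, dependency-free sketch of the arithmetic (GetStride is my own illustrative helper, not an SDK method):

```csharp
using System;

// Stride is the number of bytes occupied by a single row of pixels.
// GetStride is a hypothetical helper for illustration, not an SDK method.
int GetStride(int width, int bitsPerPixel)
{
    return width * bitsPerPixel / 8;
}

// A 640-pixel-wide Bgr32 color row: 32 bits = 4 bytes per pixel.
Console.WriteLine(GetStride(640, 32)); // 2560
// A 320-pixel-wide 16-bit depth row: 2 bytes per pixel.
Console.WriteLine(GetStride(320, 16)); // 640
```

The same formula explains why the byte and short overloads can share one expression: format.BitsPerPixel already accounts for the element size.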
Create Additional Extension Methods

Since the byte arrays for both the color and depth image streams are accessible from the ColorImageFrame and DepthImageFrame types, we can also create additional extension methods (as shown in Listing 8-5), which hang off of these types rather than off of byte arrays.

In taking raw array data and transforming it into either a Bitmap or a BitmapSource type, the most important factor to take into consideration is the pixel format. The video stream returns a series of 32-bit RGB images. The depth stream returns a series of 16-bit images. In the code below, I use 32-bit images without transparencies as the default. In other words, video stream images can always simply call ToBitmap or ToBitmapSource. Other formats are provided for by having method names that hint at the pixel format being used.
Listing 8-5. Additional Image Manipulation Helper Methods

// bitmap methods
public static Bitmap ToBitmap(this ColorImageFrame image, PixelFormat format)
{
    if (image == null || image.PixelDataLength == 0)
        return null;
    var data = new byte[image.PixelDataLength];
    image.CopyPixelDataTo(data);
    return data.ToBitmap(image.Width, image.Height, format);
}

public static Bitmap ToBitmap(this DepthImageFrame image, PixelFormat format)
{
    if (image == null || image.PixelDataLength == 0)
        return null;
    var data = new short[image.PixelDataLength];
    image.CopyPixelDataTo(data);
    return data.ToBitmap(image.Width, image.Height, format);
}

public static Bitmap ToBitmap(this ColorImageFrame image)
{
    return image.ToBitmap(PixelFormat.Format32bppRgb);
}

public static Bitmap ToBitmap(this DepthImageFrame image)
{
    return image.ToBitmap(PixelFormat.Format16bppRgb565);
}

// bitmapsource methods
public static Media.Imaging.BitmapSource ToBitmapSource(this ColorImageFrame image)
{
    if (image == null || image.PixelDataLength == 0)
        return null;
    var data = new byte[image.PixelDataLength];
    image.CopyPixelDataTo(data);
    return data.ToBitmapSource(Media.PixelFormats.Bgr32, image.Width, image.Height);
}

public static Media.Imaging.BitmapSource ToBitmapSource(this DepthImageFrame image)
{
    if (image == null || image.PixelDataLength == 0)
        return null;
    var data = new short[image.PixelDataLength];
    image.CopyPixelDataTo(data);
    return data.ToBitmapSource(Media.PixelFormats.Bgr555, image.Width, image.Height);
}

public static Media.Imaging.BitmapSource ToTransparentBitmapSource(this byte[] data,
    int width, int height)
{
    return data.ToBitmapSource(Media.PixelFormats.Bgra32, width, height);
}
You will notice that three different pixel formats show up in the Listing 8-5 extension methods. To complicate things just a little, two different enumeration types from two different libraries are used to specify the pixel format, though this is fairly easy to figure out. The Bgr32 format is simply a 32-bit color image with three color channels. Bgra32 is also 32-bit, but uses a fourth channel, called the alpha channel, for transparencies. Finally, Bgr555 is a format for 16-bit images. Recall from the previous chapters on depth processing that each pixel in the depth image is represented by two bytes. The digits 555 indicate that the blue, green, and red channels use up five bits each. For depth processing, you could equally well use the Bgr565 pixel format, which uses 6 bits for the green channel. If you like, you can add additional extension methods. For instance, I have chosen to have ToTransparentBitmapSource hang off of a byte array only and not off of a color frame type. Perhaps one off of the ColorImageFrame would be useful, though. You might also decide in your own implementations that using 32-bit images as an implicit default is simply confusing and that every conversion helper should specify the format being converted. The point of programming conventions, after all, is that they should make sense to you and to those with whom you are sharing your code.
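To make the 555/565 naming concrete, here is a small standalone sketch of how a 16-bit 565 pixel packs its three channels. The helper is mine, and I assume the common layout with red in the top five bits; consult the PixelFormats documentation for the exact bit order WPF uses:

```csharp
using System;

// Pack three 8-bit channels into one 16-bit 565 pixel:
// 5 bits for red, 6 for green, 5 for blue (hence "565").
// The layout (red high, blue low) is an assumption for illustration.
ushort Pack565(byte r, byte g, byte b)
{
    return (ushort)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

Console.WriteLine(Pack565(255, 255, 255)); // 65535: every bit set
Console.WriteLine(Pack565(0, 255, 0));     // 2016: only the middle 6 green bits
```

The 555 variant simply gives each channel five bits and leaves one bit unused, which is why either format works for visualizing two-byte depth pixels.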
Invoke the Extension Methods

In order to use these extension methods in the MainWindow code-behind, all you are required to do is to add the ImageManipulationExtensionMethods namespace to your MainWindow namespace declarations. You now have all the code necessary to concisely transform the video and depth streams into types that can be attached to the image objects in the MainWindow.xaml UI, as demonstrated in Listing 8-6.
Listing 8-6. Using Image Manipulation Extension Methods
void DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
this.depthImage.Source = e.OpenDepthImageFrame().ToBitmap().ToBitmapSource();
}
void ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
this.rgbImage.Source = e.OpenColorImageFrame().ToBitmapSource();
}
Write Conversion Methods

There is a final set of conversions that I said we would eventually want. It is useful to be able to convert System.Windows.Media.Imaging.BitmapSource objects into System.Drawing.Bitmap objects and vice versa. Listing 8-7 illustrates how to write these conversion extension methods. Once these methods are added to your arsenal of useful helpers, you can test them out by, for instance, setting the depthImage.Source to e.OpenDepthImageFrame().ToBitmapSource().ToBitmap().ToBitmapSource(). Surprisingly, this code works.
Listing 8-7. Converting Between BitmapSource and Bitmap Types

[DllImport("gdi32")]
private static extern int DeleteObject(IntPtr o);

public static Media.Imaging.BitmapSource ToBitmapSource(this Bitmap bitmap)
{
    if (bitmap == null) return null;
    IntPtr ptr = bitmap.GetHbitmap();
    var source = System.Windows.Interop.Imaging.CreateBitmapSourceFromHBitmap(
        ptr,
        IntPtr.Zero,
        Int32Rect.Empty,
        Media.Imaging.BitmapSizeOptions.FromEmptyOptions());
    DeleteObject(ptr);
    return source;
}

public static Bitmap ToBitmap(this Media.Imaging.BitmapSource source)
{
    Bitmap bitmap;
    using (MemoryStream outStream = new MemoryStream())
    {
        var enc = new Media.Imaging.PngBitmapEncoder();
        enc.Frames.Add(Media.Imaging.BitmapFrame.Create(source));
        enc.Save(outStream);
        bitmap = new Bitmap(outStream);
    }
    return bitmap;
}
The DeleteObject method in Listing 8-7 is something called a P/Invoke call, which allows us to use a method built into the operating system for memory management. We use it in the ToBitmapSource method to ensure that we are not creating an unfortunate memory leak.
Proximity Detection

Thanks to the success of Kinect on the Xbox, it is tempting to think of Kinect applications as complete experiences. Kinect can also be used, however, to simply augment standard applications that use the mouse, keyboard, or touch as primary input modes. For instance, one could use the Kinect microphone array without any of its visual capabilities as an alternative speech input device for productivity or communication applications where Kinect is only one of several options for receiving microphone input. Alternatively, one could use Kinect's visual analysis merely to recognize that something happened visually rather than try to do anything with the visual, depth, or skeleton data.

In this section, we will explore using the Kinect device as a proximity sensor. For this purpose, all that we are looking for is whether something has occurred or not. Is a person standing in front of Kinect? Is something that is not a person moving in front of Kinect? When the trigger we specify reaches a certain threshold, we then start another process. A trigger like this could be used to turn on the lights in a room when someone walks into it. For commercial advertising applications, a kiosk can go into an attract mode when no one is in range, but then begin more sophisticated interaction when a person comes close. Instead of merely writing interactive applications, it is possible to write applications that are aware of their surroundings.

Kinect can even be turned into a security camera that saves resources by recording video only when something significant happens in front of it. At night I leave food out for our outdoor cat that lives on the back porch. Recently I have begun to suspect that other critters are stealing my cat's food. By using Kinect as a combination motion detector and video camera that I leave out overnight, I can find out what is really happening. If you enjoy nature shows, you know a similar setup could realistically be used over a longer period of time to capture the appearance of rare animals. Through conserving hard drive space by turning the video camera on only when animals are near, the setup can be left out for weeks at a time provided there is a way to power it. If, like me, you sometimes prefer more fanciful entertainment than what is provided on nature shows, you could even scare yourself by setting up Kinect to record video and sound in a haunted house whenever the wind blows a curtain aside. Thinking of Kinect as an augmentation to, rather than as the main input for, an application opens up many new possibilities.
Simple Proximity Detection

As a proof of concept, we will build a proximity detector that turns the video feed on and off depending on whether someone is standing in front of Kinect. Naturally, this could be converted to perform a variety of other tasks when someone is in Kinect's visual range. The easiest way to build a proximity detector is to use the skeleton detection built into the Kinect SDK.

Begin by creating a new WPF project called ProximityDetector. Add a reference to Microsoft.Kinect.dll as well as a reference to System.Drawing. Copy the ImageExtensions.cs class file
we created in the previous section into this project and add the ImageManipulationExtensionMethods namespace declaration to the top of the MainWindow.cs code-behind. As shown in Listing 8-8, the XAML for this application is very simple. We just need an image called rgbImage that we can populate with data from the Kinect video stream.
Listing 8-8. Proximity Detector UI

<Grid>
    <Image Name="rgbImage" Stretch="Fill"/>
</Grid>
Listing 8-9 shows some of the initialization code. For the most part, this is standard code for feeding the video stream to the image control. In the MainWindow constructor we initialize the KinectSensor object, turning on both the video stream and the skeleton tracker. We create an event handler for the video stream and open the video stream. You have seen similar code many times before. What you may not have seen before, however, is the inclusion of a Boolean flag called _isTracking that is used to indicate whether our proximity detection algorithm has discovered anyone in the vicinity. If it has, the video image is updated from the video stream. If not, we bypass the video stream and assign null to the Source property of our image control.
Listing 8-9. Baseline Proximity Detection Code

KinectSensor _kinectSensor;
bool _isTracking = false;
// . . .

public MainWindow()
{
    InitializeComponent();
    this.Unloaded += delegate
    {
        _kinectSensor.ColorStream.Disable();
        _kinectSensor.SkeletonStream.Disable();
    };
    this.Loaded += delegate
    {
        _kinectSensor = Microsoft.Kinect.KinectSensor.KinectSensors[0];
        _kinectSensor.ColorFrameReady += ColorFrameReady;
        _kinectSensor.ColorStream.Enable();
        // . . .
        _kinectSensor.Start();
    };
    // . . .
}

void ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    if (_isTracking)
    {
        using (var frame = e.OpenColorImageFrame())
        {
            if (frame != null)
                rgbImage.Source = frame.ToBitmapSource();
        }
    }
    else
    {
        rgbImage.Source = null;
    }
}

private void OnDetection()
{
    if (!_isTracking)
        _isTracking = true;
}

private void OnDetectionStopped()
{
    _isTracking = false;
}
In order to toggle the _isTracking flag on, we will handle the KinectSensor.SkeletonFrameReady event. The SkeletonFrameReady event is basically something like a heartbeat. As long as there are objects in front of the camera, the SkeletonFrameReady event will keep getting invoked. In our own code, all we need to do to take advantage of this heartbeat effect is to check the skeleton data array passed to the SkeletonFrameReady event handler and verify that at least one of the items in the array is recognized and being tracked as a real person. The code for this is shown in Listing 8-10.
The tricky part of this heartbeat metaphor is that, like a heartbeat, sometimes the event does not get thrown. Consequently, while we always have a built-in mechanism to notify us when a body has been detected in front of the camera, we do not have one to tell us when it is no longer detected. In order to work around this, we start a timer whenever a person has been detected. All the timer does is check to see how long it has been since the last heartbeat was fired. If the time gap is greater than a certain threshold, we know that there has not been a heartbeat for a while and that we should end the current proximity session since, figuratively speaking, Elvis has left the building.
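Stripped of the WPF plumbing, the timer's job reduces to one comparison. The sketch below (names are mine) shows the staleness test that _timer_Tick in Listing 8-10 performs once per second:

```csharp
using System;

// A detection goes stale when the time since the last heartbeat
// exceeds the threshold (in milliseconds).
bool IsStale(DateTime lastPulse, DateTime now, int thresholdMs)
{
    return now.Subtract(lastPulse).TotalMilliseconds > thresholdMs;
}

var last = new DateTime(2012, 1, 1, 12, 0, 0);
// 50 ms later: still within a 100 ms threshold, so not stale.
Console.WriteLine(IsStale(last, last.AddMilliseconds(50), 100)); // False
// 2 seconds of silence: Elvis has left the building.
Console.WriteLine(IsStale(last, last.AddSeconds(2), 100));       // True
```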
Listing 8-10. Completed Proximity Detection Code

// . . .
int _threshold = 100;
DateTime _lastSkeletonTrackTime;
DispatcherTimer _timer = new DispatcherTimer();

public MainWindow()
{
    InitializeComponent();
    // . . .
    this.Loaded += delegate
    {
        _kinectSensor = Microsoft.Kinect.KinectSensor.KinectSensors[0];
        // . . .
        _kinectSensor.SkeletonFrameReady += Pulse;
        _kinectSensor.SkeletonStream.Enable();
        _timer.Interval = new TimeSpan(0, 0, 1);
        _timer.Tick += new EventHandler(_timer_Tick);
        _kinectSensor.Start();
    };
}

void _timer_Tick(object sender, EventArgs e)
{
    if (DateTime.Now.Subtract(_lastSkeletonTrackTime).TotalMilliseconds > _threshold)
    {
        _timer.Stop();
        OnDetectionStopped();
    }
}

private void Pulse(object sender, SkeletonFrameReadyEventArgs e)
{
    using (var skeletonFrame = e.OpenSkeletonFrame())
    {
        if (skeletonFrame == null || skeletonFrame.SkeletonArrayLength == 0)
            return;
        Skeleton[] skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];
        skeletonFrame.CopySkeletonDataTo(skeletons);
        for (int s = 0; s < skeletons.Length; s++)
        {
            if (skeletons[s].TrackingState == SkeletonTrackingState.Tracked)
            {
                OnDetection();
                _lastSkeletonTrackTime = DateTime.Now;
                if (!_timer.IsEnabled)
                {
                    _timer.Start();
                }
                break;
            }
        }
    }
}
Proximity Detection with Depth Data

This code is just the thing for the type of kiosk application we discussed above. Using skeleton tracking as the basis for proximity detection, a kiosk will go into standby mode when there is no one to interact with and simply play some sort of video instead. Unfortunately, skeleton tracking will not work so well for catching food-stealing raccoons on my back porch or for capturing images of Sasquatch in the wilderness. This is because the skeleton tracking algorithms are keyed for humans and a certain set of body types. Outside of this range of human body types, objects in front of the camera will either not be tracked or, worse, tracked inconsistently.

To get around this, we can use the Kinect depth data, rather than skeleton tracking, as the basis for proximity detection. As shown in Listing 8-11, the runtime must first be configured to capture the color and depth streams rather than color and skeletal tracking.
Listing 8-11. Proximity Detection Configuration Using the Depth Stream
_kinectSensor.ColorFrameReady += ColorFrameReady;
_kinectSensor.DepthFrameReady += DepthFrameReady;
_kinectSensor.ColorStream.Enable();
_kinectSensor.DepthStream.Enable();
There are several advantages to using depth data rather than skeleton tracking as the basis of a proximity detection algorithm. First, the heartbeat provided by the depth stream is continuous as long as the Kinect sensor is running. This obviates the necessity of setting up a separate timer to monitor whether something has stopped being detected. Second, we can set up a minimum and maximum threshold within which we are looking for objects. If an object is closer to the depth camera than the minimum threshold or farther away from the camera than the maximum threshold, we toggle the _isTracking flag off. The proximity detection code in Listing 8-12 detects any object between 1000 and 1200 millimeters from the depth camera. It does this by analyzing each pixel of the depth stream image and determining if any pixel falls within the detection range. If it finds a pixel that falls within this range, it stops analyzing the image and sets _isTracking to true. The separate code for handling the ColorFrameReady event picks up on the fact that something has been detected and begins updating the image control with video stream data.
Listing 8-12. Proximity Detection Algorithm Using the Depth Stream

void DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    bool isInRange = false;
    using (var imageData = e.OpenDepthImageFrame())
    {
        if (imageData == null || imageData.PixelDataLength == 0)
            return;
        short[] bits = new short[imageData.PixelDataLength];
        imageData.CopyPixelDataTo(bits);
        int minThreshold = 1000;
        int maxThreshold = 1200;
        // each element of the short array is one pixel
        for (int i = 0; i < bits.Length; i++)
        {
            var depth = bits[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
            if (depth > minThreshold && depth < maxThreshold)
            {
                isInRange = true;
                OnDetection();
                break;
            }
        }
    }
    if (!isInRange)
        OnDetectionStopped();
}
A final advantage of using depth data rather than skeletal tracking data for proximity detection is that it is much faster. Even though skeletal tracking occurs at a much lower level than our analysis of the depth stream data, it requires that a full human body is in the camera's field of vision. Additional time is required to analyze the entire human body image with the decision trees built into the Kinect SDK and verify that it falls within certain parameters set up for skeletal recognition. With this depth image algorithm, we are simply looking for one pixel within a given range rather than identifying the entire human outline. Unlike the skeletal tracking algorithm we used previously, the depth algorithm in Listing 8-12 will trigger the OnDetection method as soon as something is within range, even at the very edge of the depth camera's field of vision.
Refining Proximity Detection

There are also shortcomings to using the depth data, of course. The area between the minimum and maximum depth range must be kept clear in order to avoid having _isTracking always set to true. While depth tracking allows us to relax the conditions that set off the proximity detection beyond human beings, it may relax them a bit too much since now even inanimate objects can trigger the proximity detector. Before moving on to implementing a motion detector to solve this problem of having a
proximity detector that is either too strict or too loose, I want to introduce a third possibility for the sake of completeness.

Listing 8-13 demonstrates how to implement a proximity detector that combines both player data and depth data. This is a good choice if the skeleton tracking algorithm fits your needs but you would like to constrain it further by only detecting human shapes between a minimum and a maximum distance from the depth camera. This could be useful, again, for a kiosk type application set up in an open area. One set of interactions can be triggered when a person enters the viewable area in front of Kinect. Another set of interactions can be triggered when a person is within a meter and a half of Kinect, and then a third set of interactions can occur when the person is close enough to touch the kiosk itself.

To set up this sort of proximity detection, you will want to reconfigure the KinectSensor in the MainWindow constructor by enabling skeleton detection in order to use depth as well as player data rather than depth data alone. Once this is done, the event handler for the DepthFrameReady event can be rewritten to check for depth thresholds as well as the presence of a human shape. All the remaining code can stay the same.
Listing 8-13. Proximity Detection Algorithm Using the Depth Stream and Player Index

void DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    bool isInRange = false;
    using (var imageData = e.OpenDepthImageFrame())
    {
        if (imageData == null || imageData.PixelDataLength == 0)
            return;
        short[] bits = new short[imageData.PixelDataLength];
        imageData.CopyPixelDataTo(bits);
        int minThreshold = 1700;
        int maxThreshold = 2000;
        // each element of the short array is one pixel
        for (int i = 0; i < bits.Length; i++)
        {
            var depth = bits[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
            var player = bits[i] & DepthImageFrame.PlayerIndexBitmask;
            if (player > 0 && depth > minThreshold && depth < maxThreshold)
            {
                isInRange = true;
                OnDetection();
                break;
            }
        }
    }
    if (!isInRange)
        OnDetectionStopped();
}
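The bit twiddling in Listings 8-12 and 8-13 can be checked in isolation. In this version of the SDK, each 16-bit depth value stores the player index in its low three bits (DepthImageFrame.PlayerIndexBitmaskWidth is 3, and PlayerIndexBitmask is 7) and the distance in millimeters in the bits above them. The Pack helper below is mine, invented to simulate a raw depth pixel:

```csharp
using System;

const int playerIndexBitmaskWidth = 3; // low 3 bits hold the player index
const int playerIndexBitmask = 7;      // binary 111

// Simulate a raw depth pixel: depth in the high bits, player index low.
short Pack(int depthMm, int playerIndex) =>
    (short)((depthMm << playerIndexBitmaskWidth) | playerIndex);

short raw = Pack(1850, 2);
int depth = raw >> playerIndexBitmaskWidth;  // recover millimeters
int player = raw & playerIndexBitmask;       // recover player index
Console.WriteLine(depth);  // 1850
Console.WriteLine(player); // 2
```

A player value of zero means the pixel belongs to no tracked person, which is why Listing 8-13 requires player > 0 before triggering detection.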
Detecting Motion

Motion detection is by far the most interesting way to implement proximity detection. The basic strategy for implementing motion detection is to start with an initial baseline RGB image. As each image is received from the video stream, it can be compared against the baseline image. If differences are detected, we can assume that something has moved in the field of view of the RGB camera.

You have no doubt already found the central flaw in this strategy. In the real world, objects get moved. In a room, someone might move the furniture around slightly. Outdoors, a car might be moved or the wind might shift the angle of a small tree. In each of these cases, since there has been a change even though there is no continuous motion, the system will detect a false positive and will indicate motion where there is none. In these cases, what we would like to be able to do is to change the baseline image intermittently.
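Before reaching for a library, it is worth seeing the naive version of this strategy. The sketch below (all names mine) compares a frame against a fixed baseline pixel by pixel, ignoring small differences as sensor noise; the adaptive baseline that Emgu's MotionHistory maintains is precisely the cure for the false positives this fixed-baseline version produces:

```csharp
using System;

// Report motion when more than minChangedPixels differ from the
// baseline by more than the noise tolerance.
bool DetectMotion(byte[] baseline, byte[] frame, int tolerance, int minChangedPixels)
{
    int changed = 0;
    for (int i = 0; i < baseline.Length; i++)
    {
        if (Math.Abs(baseline[i] - frame[i]) > tolerance)
            changed++;
    }
    return changed > minChangedPixels;
}

var bg = new byte[] { 10, 10, 10, 10 };
var noisy = new byte[] { 12, 9, 10, 11 };    // sensor noise only
var moved = new byte[] { 10, 200, 210, 10 }; // something entered the scene
Console.WriteLine(DetectMotion(bg, noisy, 5, 1)); // False
Console.WriteLine(DetectMotion(bg, moved, 5, 1)); // True
```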
Accomplishing something like this requires more advanced image analysis and processing than we have encountered so far. Fortunately an open source project known as OpenCV (Open Computer Vision) provides a library for performing these sorts of complex real-time image processing operations. Intel Research initiated OpenCV in 1999 to provide the results of advanced vision research to the world. In 2008, the project was updated by and is currently supported through Willow Garage, a technology incubation company. Around the same time, a project called Emgu CV was started, which provides a .NET wrapper for OpenCV. We will be using Emgu CV to implement motion detection and also for several subsequent sample projects.

The official Emgu CV site is at www.emgu.com. The actual code and installation packages are hosted on SourceForge at http://sourceforge.net/projects/emgucv/files/. In the Kinect SDK projects discussed in this book we use the 2.3.0 version of Emgu CV. Actual installation is fairly straightforward. Simply find the executable suitable for your Windows operating system and run it. There is one caveat, however. Emgu CV seems to run best using the x86 architecture. If you are developing on a 64-bit machine, you are best off explicitly setting your platform target for projects using the Emgu library to x86, as illustrated in Figure 8-1. (You can also pull down the Emgu source code and compile it yourself for x64, if you wish.) To get to the Platform Target setting, select the properties for your project either by right-clicking on your project in the Visual Studio Solutions pane or by selecting Project | Properties on the menu bar at the top of the Visual Studio IDE. Then select the Build tab, which should be the second tab available.
Figure 8-1. Setting the platform target
In order to work with the Emgu library, you will generally need to add references to three dlls: Emgu.CV, Emgu.CV.UI, and Emgu.Util. These will typically be found in the Emgu install folder. On my computer, they are found at C:\Emgu\emgucv-windows-x86 2.3.0.1416\bin\.

There is an additional rather confusing, and admittedly rather messy, step. Because Emgu is a wrapper of C++ libraries, you will also need to place several additional unmanaged dlls in a location where the Emgu wrappers expect to find them. Emgu looks for these files in the executable directory. If you are compiling a debug project, this would be the bin/Debug folder. For release compilation, this would be the bin/Release subdirectory of your project. Eleven files need to be copied into your executable directory: opencv_calib3d231.dll, opencv_contrib231.dll, opencv_core231.dll, opencv_features2d231.dll, opencv_ffmpeg.dll, opencv_highgui231.dll, opencv_imgproc231.dll, opencv_legacy231.dll, opencv_ml231.dll, opencv_objectdetect231.dll, and opencv_video231.dll. These can be found in the bin subdirectory of the Emgu installation. For convenience, you can also simply copy over any dll in that folder that begins with "opencv_*".
As mentioned earlier, unlocking the full potential of the Kinect SDK by combining it with additional tools can sometimes get messy. By adding the image processing capabilities of OpenCV and Emgu, however, we begin to have some very powerful toys to play with. For instance, we can begin implementing a true motion tracking solution.

We need to add a few more helper extension methods to our toolbox first, though. As mentioned earlier, each library has its own core image type that it understands. In the case of Emgu, this type is the generic Image<TColor, TDepth> type, which implements the Emgu.CV.IImage interface. Listing 8-14 shows some extension methods for converting between the image types we are already familiar with and the Emgu-specific image type. Create a new static class for your project called EmguImageExtensions.cs. Give it a namespace of ImageManipulationExtensionMethods. By using the same namespace as our earlier
ImageExtensions class, we can make all of the extension methods we have written available to a file with only one namespace declaration. This class will have three conversions: from Microsoft.Kinect.ColorImageFrame to Emgu.CV.Image<TColor, TDepth>, from System.Drawing.Bitmap to Emgu.CV.Image<TColor, TDepth>, and finally from Emgu.CV.Image<TColor, TDepth> to System.Windows.Media.Imaging.BitmapSource.
Listing 8-14. Emgu Extension Methods

namespace ImageManipulationExtensionMethods
{
    public static class EmguImageExtensions
    {
        public static Image<TColor, TDepth> ToOpenCVImage<TColor, TDepth>(
            this ColorImageFrame image)
            where TColor : struct, IColor
            where TDepth : new()
        {
            var bitmap = image.ToBitmap();
            return new Image<TColor, TDepth>(bitmap);
        }

        public static Image<TColor, TDepth> ToOpenCVImage<TColor, TDepth>(
            this Bitmap bitmap)
            where TColor : struct, IColor
            where TDepth : new()
        {
            return new Image<TColor, TDepth>(bitmap);
        }

        public static System.Windows.Media.Imaging.BitmapSource ToBitmapSource(
            this IImage image)
        {
            var source = image.Bitmap.ToBitmapSource();
            return source;
        }
    }
}
In implementing motion detection with the Emgu library, we will use the polling technique introduced in earlier chapters rather than eventing. Because image processing can be resource intensive, we want to throttle how often we perform it, which is really only possible by using polling. It should be pointed out that this is only a proof of concept application. This code has been written chiefly with a goal of readability, in particular printed readability, rather than performance.

All the data we need for motion tracking will be adequately provided by the color stream. As discussed in earlier chapters, the CompositionTarget.Rendering event is generally used to perform polling on the video stream. For this application, however, we will create a BackgroundWorker object to do the polling. As shown in Listing 8-15, the background worker will call a method called Pulse to poll the color stream and perform some resource-intensive processing. When the threaded background worker completes an iteration, it will again poll for another frame and perform another processing operation. Two Emgu objects are declared as members: a MotionHistory object and an IBGFGDetector object. These two objects will be
used together to create the constantly updating baseline image we will compare against to detect motion.
Listing 8-15. Motion Detection Configuration

KinectSensor _kinectSensor;
private MotionHistory _motionHistory;
private IBGFGDetector<Bgr> _forgroundDetector;
bool _isTracking = false;

public MainWindow()
{
    InitializeComponent();
    this.Unloaded += delegate
    {
        _kinectSensor.ColorStream.Disable();
    };
    this.Loaded += delegate
    {
        _motionHistory = new MotionHistory(
            1.0,   // in seconds, the duration of motion history you want to keep
            0.05,  // in seconds, parameter for cvCalcMotionGradient
            0.5);  // in seconds, parameter for cvCalcMotionGradient

        _kinectSensor = KinectSensor.KinectSensors[0];
        _kinectSensor.ColorStream.Enable();
        _kinectSensor.Start();

        BackgroundWorker bw = new BackgroundWorker();
        bw.DoWork += (a, b) => Pulse();
        bw.RunWorkerCompleted += (c, d) => { bw.RunWorkerAsync(); };
        bw.RunWorkerAsync();
    };
}
Listing 8-16 shows the actual code used to perform image processing in order to detect motion. The code is a modified version of sample code provided with the Emgu install. The first task in the Pulse method is to convert the ColorImageFrame provided by the color stream into an Emgu image type. The _forgroundDetector is then used both to update the _motionHistory object, which is the container for the constantly revised baseline image, as well as to compare against the baseline image to see if any changes have occurred. An image is created to capture any discrepancies between the baseline image and the current image from the color stream. This image is then transformed into a sequence of smaller images that break down any motion detected. We then loop through this sequence of movement images to see if they have surpassed a certain threshold of movement we have established. If the movement is substantial, we finally show the video image. If none of the movements are substantial or if none are captured, we hide the video image.
Listing 8-16. Motion Detection Algorithm
private void Pulse()
{
using (ColorImageFrame imageFrame = _kinectSensor.ColorStream.OpenNextFrame(200))
{
if (imageFrame == null)
return;
using (Image<Bgr, byte> image = imageFrame.ToOpenCVImage<Bgr, byte>())
using (MemStorage storage = new MemStorage()) //create storage for motion components
{
if (_forgroundDetector == null)
{
_forgroundDetector = new BGStatModel<Bgr>(image
, Emgu.CV.CvEnum.BG_STAT_TYPE.GAUSSIAN_BG_MODEL);
}
_forgroundDetector.Update(image);
//update the motion history
_motionHistory.Update(_forgroundDetector.ForgroundMask);
//get a copy of the motion mask and enhance its color
double[] minValues, maxValues;
System.Drawing.Point[] minLoc, maxLoc;
_motionHistory.Mask.MinMax(out minValues, out maxValues
, out minLoc, out maxLoc);
Image<Gray, Byte> motionMask = _motionHistory.Mask
.Mul(255.0 / maxValues[0]);
//create the motion image
Image<Bgr, Byte> motionImage = new Image<Bgr, byte>(motionMask.Size);
motionImage[0] = motionMask;
//Threshold to define a motion area
//reduce the value to detect smaller motion
double minArea = 100;
storage.Clear(); //clear the storage
Seq<MCvConnectedComp> motionComponents =
_motionHistory.GetMotionComponents(storage);
bool isMotionDetected = false;
//iterate through each of the motion components
for (int c = 0; c < motionComponents.Count(); c++)
{
MCvConnectedComp comp = motionComponents[c];
//reject the components that have a small area
if (comp.area < minArea) continue;
OnDetection();
isMotionDetected = true;
break;
}
if (isMotionDetected == false)
{
OnDetectionStopped();
this.Dispatcher.Invoke(new Action(() => rgbImage.Source = null));
return;
}
this.Dispatcher.Invoke(
new Action(() => rgbImage.Source = imageFrame.ToBitmapSource())
);
}
}
}
Saving the Video

It would be nice to be able to complete this project by actually recording a video to the hard drive instead of simply displaying the video feed. Video recording, however, is notoriously tricky and, while you will find many Kinect samples on the Internet showing you how to save a still image to disk, very few demonstrate how to save a complete video to disk. Fortunately, Emgu provides a VideoWriter type that allows us to do just that.

Listing 8-17 illustrates how to implement a Record and a StopRecording method in order to write images streamed from the Kinect RGB camera to an AVI file. For this code I have created a folder called vids on my D drive. To be written to, this directory must exist. When recording starts, we create a file name based on the time at which the recording begins. We also begin aggregating the images from the video stream into a generic list of images. When StopRecording is called, this list of Emgu images is passed to the VideoWriter object in order to write to disk. This particular code does not use an encoder and consequently creates very large AVI files. You can opt to encode the AVI file to compress the video written to disk, though the tradeoff is that this process is much more processor intensive.
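The file-naming scheme in Listing 8-17 is easy to verify on its own. The sketch below (BuildFileName is my own helper) reproduces the same format string with a fixed DateTime so the result is predictable:

```csharp
using System;

// Same pattern as Listing 8-17: base directory + timestamp + extension.
string BuildFileName(string baseDirectory, DateTime now)
{
    return string.Format("{0}{1}{2}", baseDirectory,
        now.ToString("MMddyyyyHmmss"), ".avi");
}

var stamp = new DateTime(2012, 3, 15, 9, 5, 7);
Console.WriteLine(BuildFileName(@"d:\vids\", stamp));
// d:\vids\0315201290507.avi
```

Note that the single H token does not zero-pad the hour, so recordings before 10 a.m. produce a thirteen-digit timestamp rather than fourteen.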
Listing 8-17. Recording Video
bool _isRecording = false;
string _baseDirectory = @"d:\vids\";
string _fileName;
List<Image<Rgb,Byte>> _videoArray = new List<Image<Rgb,Byte>>();
void Record(ColorImageFrame image)
{
if (!_isRecording)
{
_fileName = string.Format("{0}{1}{2}", _baseDirectory
, DateTime.Now.ToString("MMddyyyyHmmss"), ".avi");
_isRecording = true;
}
_videoArray.Add(image.ToOpenCVImage<Rgb,Byte>());
}
void StopRecording()
{
if (!_isRecording)
return;
using (VideoWriter vw = new VideoWriter(_fileName, 0, 30, 640, 480, true))
{
for (int i = 0; i < _videoArray.Count(); i++)
vw.WriteFrame<Rgb, Byte>(_videoArray[i]);
}
_fileName = string.Empty;
_videoArray.Clear();
_isRecording = false;
}
The final piece of this motion detection video camera is simply to modify the RGB polling code to not only stream images to the image control in our UI but also to call the Record method when motion is detected and to call the StopRecording method when no motion is detected, as shown in Listing 8-18. This will provide you with a fully working, sophisticated prototype that analyzes raw stream data to detect any changes in the viewable area in front of Kinect and also does something useful with that information.
Listing 8-18. Calling the Record and StopRecording Methods
if (isMotionDetected == false)
{
OnDetectionStopped();
this.Dispatcher.Invoke(new Action(() => rgbImage.Source = null));
StopRecording();
return;
}
this.Dispatcher.Invoke(
new Action(() => rgbImage.Source = imageFrame.ToBitmapSource())
);
Record(imageFrame);
Identifying Faces

The Emgu CV library can also be used to detect faces. While actual facial recognition, identifying a person based on an image of him, is too complex to be considered here, processing an image in order to find portions of it that contain faces is an integral first step in achieving full facial recognition capability.

Most facial detection software is built around something called Haar-like features, which is an application of Haar wavelets, a sequence of mathematically defined square shapes. Paul Viola and Michael Jones developed the Viola-Jones object detection framework in 2001 based on identifying Haar-like features, a less computationally expensive method than other available techniques for performing facial detection. Their work was incorporated into OpenCV.

Facial detection in OpenCV and Emgu CV is built around a set of rules enshrined in an XML file written by Rainer Lienhart. The file is called haarcascade_frontalface_default.xml and can be retrieved from the Emgu samples. It is also included in the sample code associated with this chapter and is covered under the OpenCV BSD license. There is also a set of rules available for eye recognition, which we will not use in the current project.
To construct a simple face detection program to use with the Kinect SDK, create a new WPF project called FaceFinder. Add references to the following dlls: Microsoft.Kinect, System.Drawing, Emgu.CV, Emgu.CV.UI, and Emgu.Util. Add the opencv_* dlls to your build folder. Finally, add the two extension library files we created earlier in this chapter to the project: ImageExtensions.cs and EmguImageExtensions.cs. The XAML for this project is as simple as in previous examples. Just add an image control to the root Grid in MainWindow and name it rgbImage.

Instantiate a KinectSensor object in the MainWindow constructor and configure it to use only the video stream. Since the Emgu CV library is intended for image processing, we typically use it with RGB images rather than depth images. Listing 8-19 shows what this setup code should look like. We will use a BackgroundWorker object to poll the video stream. Each time the background worker has completed an iteration, it will poll the video stream again.
Listing 8-19. Face Detection Setup
KinectSensor _kinectSensor;

public MainWindow()
{
    InitializeComponent();

    this.Unloaded += delegate
    {
        _kinectSensor.ColorStream.Disable();
    };

    this.Loaded += delegate
    {
        _kinectSensor = KinectSensor.KinectSensors[0];
        _kinectSensor.ColorStream.Enable();
        _kinectSensor.Start();
    };

    BackgroundWorker bw = new BackgroundWorker();
    bw.RunWorkerCompleted += (a, b) => bw.RunWorkerAsync();
    bw.DoWork += delegate { Pulse(); };
    bw.RunWorkerAsync();
}
The Pulse method, which handles the background worker's DoWork event, is the main workhorse here. The code shown in Listing 8-20 is modified from samples provided with the Emgu install. We instantiate a new HaarCascade instance based on the provided face detection rules file. Next, we retrieve an image from the video stream and convert it into an Emgu image type. This image is grayscaled, and a higher contrast is applied to it to make facial detection easier. The Haar detection rules are applied to the image in order to generate a series of structures that indicate where in the image faces were found. A blue rectangle is drawn around any detected faces. The composite image is then converted into a BitmapSource type and passed to the image control. Because of the way WPF threading works, we have to use the Dispatcher object here to perform the assignment in the correct thread.
Listing 8-20. Face Detection Algorithm
string faceFileName = "haarcascade_frontalface_default.xml";

public void Pulse()
{
    using (HaarCascade face = new HaarCascade(faceFileName))
    {
        var frame = _kinectSensor.ColorStream.OpenNextFrame(100);
        var image = frame.ToOpenCVImage<Rgb, Byte>();

        //Convert it to grayscale
        using (Image<Gray, Byte> gray = image.Convert<Gray, Byte>())
        {
            //Normalizes brightness and increases contrast of the image
            gray._EqualizeHist();

            MCvAvgComp[] facesDetected = face.Detect(
                gray,
                1.1,
                10,
                Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
                new System.Drawing.Size(20, 20));

            foreach (MCvAvgComp f in facesDetected)
            {
                image.Draw(f.rect, new Rgb(System.Drawing.Color.Blue), 2);
            }
        }

        Dispatcher.BeginInvoke(new Action(() =>
        {
            rgbImage.Source = image.ToBitmapSource();
        }));
    }
}
Figure 8-2 shows the results of applying this code. The accuracy of the blue frame around detected faces is much better than what we might get by trying to perform similar logic using skeletal tracking.
Figure 8-2. Finding faces
Since the structures contained in the facesDetected array clearly provide location information, we can also use the face detection algorithm to build an augmented reality application. The trick is to have an image available and then, instead of drawing a blue rectangle into the video stream image, draw the standby image instead. Listing 8-21 shows the code we would use to replace the blue rectangle code.
Listing 8-21. Augmented Reality Implementation
Image<Rgb, Byte> laughingMan = new Image<Rgb, byte>("laughing_man.jpg");

foreach (MCvAvgComp f in facesDetected)
{
    //image.Draw(f.rect, new Rgb(System.Drawing.Color.Blue), 2);
    var rect = new System.Drawing.Rectangle(f.rect.X - f.rect.Width / 2,
        f.rect.Y - f.rect.Height / 2,
        f.rect.Width * 2,
        f.rect.Height * 2);

    var newImage = laughingMan.Resize(rect.Width, rect.Height,
        Emgu.CV.CvEnum.INTER.CV_INTER_LINEAR);

    for (int i = 0; i < rect.Height; i++)
    {
        for (int j = 0; j < rect.Width; j++)
        {
            if (newImage[i, j].Blue != 0 && newImage[i, j].Red != 0
                && newImage[i, j].Green != 0)
                image[i + rect.Y, j + rect.X] = newImage[i, j];
        }
    }
}
The resulting effect shown in Figure 8-3 is from an anime called Ghost in the Shell: Stand Alone Complex, in which a hacker in the near future hides himself within a pervasively surveilled society by superimposing a laughing man over his own image whenever his face is captured on video. Because the underlying algorithm is so good, the laughing man image, much as it does in the anime, scales as faces approach or move away from the camera.
Figure 8-3. Laughing man augmented reality effect
With some additional work, this basic code can be adapted to take one person's face and superimpose it on someone else's head. You could even use it to pull data from multiple Kinects and merge faces and objects together. All this requires is having the appropriate Haar cascades for the objects you want to superimpose.
Holograms
Another interesting effect associated with Kinect is the pseudo-hologram. A 3D image can be made to tilt and shift based on the various positions of a person standing in front of Kinect. When done right, the effect creates the illusion that the 3D image exists in a 3D space that extends into the display monitor. Because of the 3D vector graphics capabilities of WPF, what I've described is actually easy to implement using Kinect and WPF. Figure 8-4 shows a simple 3D cube that can be made to rotate and scale depending on an observer's position. The illusion only works when there is a single observer, however.
Figure 8-4. 3D cube
This effect actually goes back to a Wii Remote hack that Johnny Chung Lee demonstrated at his 2008 TED talk. This is the same Johnny Lee who worked on the Kinect team for a while and also inspired the AdaFruit contest to hack together a community driver for the Kinect sensor. In Lee's implementation, an infrared sensor from the Wii remote was placed on a pair of glasses to track a person wearing the glasses as he moved around the room. The display would then rotate a complex 3D image based on the movements of the pair of glasses to create the hologram effect.

The Kinect SDK implementation for this is relatively simple. Kinect already provides X, Y, and Z coordinates for a player skeleton, represented in meters. The difficult part is creating an interesting 3D vector image in XAML. For this project I use a tool called Blender, which is an open source 3D model creation suite available at www.blender.org. To get 3D meshes to export as XAML, however, it is necessary to find an add-in to Blender that will allow you to do so. The version of Blender I use is 2.6, and while there is an exporter available for it, it is somewhat limited. Dan Lehenbauer also has a XAML exporter for Blender available on CodePlex, but it only works on older versions of Blender. As with most efforts to create interesting mashups with the Kinect SDK, this is once again an instance in which some elbow grease and lots of patience are required.
The central concept of 3D vector graphics in WPF is the Viewport3D object. The Viewport3D can be thought of as a 3D space into which we can deposit objects, light sources, and a camera. To build the 3D effect, create a new WPF project in Visual Studio called Hologram and add a reference to the Microsoft.Kinect dll. In the MainWindow UI, create a new Viewport3D element nested in the root Grid. Listing 8-22 shows what the markup for the fully drawn cube looks like. The markup is also available in the sample projects associated with this chapter. In this project, the only part of this code that interacts with Kinect is the Viewport3D camera. Consequently, it is very important to name the camera.

The camera in Listing 8-22 has a position expressed in X, Y, Z coordinate space. X increases in value from left to right. Y increases from the bottom moving up. Z increases as it leaves the plane of the screen and approaches the observer. The look direction, in this case, simply inverts the position. This tells the camera to look directly back at the 0,0,0 coordinate. UpDirection, finally, indicates the orientation of the camera; in this case, up is the positive Y direction.
Listing 8-22. The Cube
<Viewport3D>
<Viewport3D.Camera>
<PerspectiveCamera x:Name="camera" Position="-40,160,100"
LookDirection="40,-160,-100"
UpDirection="0,1,0" />
</Viewport3D.Camera>
<ModelVisual3D >
<ModelVisual3D.Content>
<Model3DGroup>
<DirectionalLight Color="White" Direction="-1,-1,-3" />
<GeometryModel3D >
<GeometryModel3D.Geometry>
<MeshGeometry3D
Positions="1.000000,1.000000,-1.000000 1.000000,-1.000000,-1.000000 -1.000000,-1.000000,
-1.000000 -1.000000,1.000000,-1.000000 1.000000,0.999999,1.000000 -1.000000,1.000000,
1.000000 -1.000000,-1.000000,1.000000 0.999999,-1.000001,1.000000 1.000000,
1.000000,-1.000000 1.000000,0.999999,1.000000 0.999999,-1.000001,1.000000 1.000000,-1.000000,
-1.000000 1.000000,-1.000000,-1.000000 0.999999,-1.000001,1.000000 -1.000000,
-1.000000,1.000000 -1.000000,-1.000000,-1.000000 -1.000000,-1.000000,-1.000000 -1.000000,
-1.000000,1.000000 -1.000000,1.000000,1.000000 -1.000000,1.000000,
-1.000000 1.000000,0.999999,1.000000 1.000000,1.000000,-1.000000 -1.000000,
1.000000,-1.000000 -1.000000,1.000000,1.000000"
TriangleIndices="0,1,3 1,2,3 4,5,7 5,6,7 8,9,11 9,10,11 12,13,15 13,14,15 16,17,
19 17,18,19 20,21,23 21,22,23"
Normals="0.000000,0.000000,-1.000000 0.000000,0.000000,-1.000000 0.000000,0.000000,
-1.000000 0.000000,0.000000,-1.000000 0.000000,-0.000000,1.000000 0.000000,-0.000000,
1.000000 0.000000,-0.000000,1.000000 0.000000,-0.000000,1.000000 1.000000,-0.000000,
0.000000 1.000000,-0.000000,0.000000 1.000000,-0.000000,0.000000 1.000000,-0.000000,
0.000000 -0.000000,-1.000000,-0.000000 -0.000000,-1.000000,-0.000000 -0.000000,
-1.000000,-0.000000 -0.000000,-1.000000,-0.000000 -1.000000,0.000000,-0.000000
-1.000000,0.000000,-0.000000 -1.000000,0.000000,-0.000000 -1.000000,0.000000,
-0.000000 0.000000,1.000000,0.000000 0.000000,1.000000,0.000000 0.000000,1.000000,
0.000000 0.000000,1.000000,0.000000"/>
</GeometryModel3D.Geometry>
<GeometryModel3D.Material>
<DiffuseMaterial Brush="blue"/>
</GeometryModel3D.Material>
</GeometryModel3D>
<Model3DGroup.Transform>
<Transform3DGroup>
<Transform3DGroup.Children>
<TranslateTransform3D OffsetX="0" OffsetY="0"
OffsetZ="0.0935395359992981"/>
<ScaleTransform3D ScaleX="12.5608325004577637"
ScaleY="12.5608322620391846" ScaleZ="12.5608325004577637"/>
</Transform3DGroup.Children>
</Transform3DGroup>
</Model3DGroup.Transform>
</Model3DGroup>
</ModelVisual3D.Content>
</ModelVisual3D>
</Viewport3D>
The cube itself is drawn using a series of eight positions, each represented by three coordinates. Triangle indices are then drawn over these points to provide a surface to the cube. To this we add a Material object and paint it blue. We also add a scale transform to the cube to make it bigger. Finally, we add a directional light to improve the 3D effect we are trying to create.

In the code-behind for MainWindow, we only need to configure the KinectSensor to support skeletal tracking, as shown in Listing 8-23. Video and depth data are uninteresting to us for this project.
Listing 8-23. Hologram Configuration
KinectSensor _kinectSensor;

public MainWindow()
{
    InitializeComponent();

    this.Unloaded += delegate
    {
        _kinectSensor.DepthStream.Disable();
        _kinectSensor.SkeletonStream.Disable();
    };

    this.Loaded += delegate
    {
        _kinectSensor = KinectSensor.KinectSensors[0];
        _kinectSensor.SkeletonFrameReady += SkeletonFrameReady;
        _kinectSensor.DepthFrameReady += DepthFrameReady;
        _kinectSensor.SkeletonStream.Enable();
        _kinectSensor.DepthStream.Enable();
        _kinectSensor.Start();
    };
}
To create the holographic effect, we will be moving the camera around our cube rather than attempting to rotate the cube itself. We must first determine if a person is actually being tracked by Kinect. If someone is, we simply ignore any additional players Kinect may have picked up. We select the skeleton we have found and extract its X, Y, and Z coordinates. Even though the Kinect position data is based on meters, our 3D cube is not, so it is necessary to massage these positions in order to maintain the 3D illusion. Based on these tweaked position coordinates, we move the camera around to be roughly in the same spatial location as the player Kinect is tracking, as shown in Listing 8-24. We also take these coordinates and invert them so the camera continues to point toward the 0,0,0 origin position.
Listing 8-24. Moving the Camera Based on User Position
void SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    float x = 0, y = 0, z = 0;

    //get angle of skeleton
    using (var frame = e.OpenSkeletonFrame())
    {
        if (frame == null || frame.SkeletonArrayLength == 0)
            return;

        var skeletons = new Skeleton[frame.SkeletonArrayLength];
        frame.CopySkeletonDataTo(skeletons);

        for (int s = 0; s < skeletons.Length; s++)
        {
            if (skeletons[s].TrackingState == SkeletonTrackingState.Tracked)
            {
                border.BorderBrush = new SolidColorBrush(Colors.Red);
                var skeleton = skeletons[s];
                x = skeleton.Position.X * 60;
                z = skeleton.Position.Z * 120;
                y = skeleton.Position.Y;
                break;
            }
            else
            {
                border.BorderBrush = new SolidColorBrush(Colors.Black);
            }
        }
    }

    if (Math.Abs(x) > 0)
    {
        camera.Position = new System.Windows.Media.Media3D.Point3D(x, y, z);
        camera.LookDirection = new System.Windows.Media.Media3D.Vector3D(-x, -y, -z);
    }
}
As interesting as this effect already is, it turns out that the hologram illusion is even better when more complex 3D objects are introduced. A 3D cube can easily be converted into an oblong shape, as illustrated in Figure 8-5, simply by increasing the scale of the cube in the Z direction. This creates an oblong that sticks out toward the player. We can also multiply the number of oblongs by copying the new oblong's ModelVisual3D element into the Viewport3D multiple times. Use the translate transform to place these oblongs in different locations on the X and Y axes and give each a different color. Since the camera is the only object the code-behind is aware of, transforming and adding new 3D objects to the 3D viewport does not affect the way the Hologram project works at all.
Figure 8-5. 3D oblongs
Libraries to Keep an Eye On
Several libraries and tools relevant to Kinect are expected to be expanded to work with the Kinect SDK over the next year. Of these, the most intriguing are FAAST, Unity3D, and Microsoft Robotics Developer Studio.

The Flexible Action and Articulated Skeleton Toolkit (FAAST) can best be thought of as a middleware library for bridging the gap between Kinect's gestural interface and traditional interfaces. Written and maintained by the Institute for Creative Technologies at the University of Southern California, FAAST is a gesture library initially written on top of OpenNI for use with Kinect. What makes the toolkit brilliant is that it facilitates the mapping of these built-in gestures to almost any API and even allows mapping gestures to keyboard keystrokes. This has allowed hackers to use the toolkit to play a variety of video games using the Kinect sensor, including first-person shooters like Call of Duty and online games like Second Life and World of Warcraft. At last report, a version of FAAST is being developed to work with the Kinect SDK rather than OpenNI. You can read more about FAAST at http://projects.ict.usc.edu/mxr/faast.

Unity3D is a tool available in both free and professional versions that makes the traditionally difficult task of developing 3D games relatively easy. Games written in Unity3D can be exported to multiple platforms including the web, Windows, iOS, iPhone, iPad, Android, Xbox, Playstation, and Wii. It also supports third-party add-ins, including several created for Kinect, allowing developers to create Windows games that use the Kinect sensor for input. Find out more about Unity3D at http://unity3d.com.

Microsoft Robotics Developer Studio is Microsoft's platform for building software for robots. Integration with Kinect has been built into recent betas of the product. Besides access to Kinect services, Kinect support also includes specifications for a reference platform for Kinect-enabled robots (that may eventually be transformed into a kit that can be purchased) as well as an obstacle avoidance sample using the Kinect sensor. You can learn more about Microsoft Robotics Developer Studio at http://www.microsoft.com/robotics.
Summary
In this chapter you have learned that the Kinect SDK can be used with a range of libraries and tools to create fascinating and textured mashups. You were introduced to the OpenCV wrapper Emgu CV, which provides access to complex mathematical equations for analyzing and modifying image data. You also began building your own library of helper extension methods to simplify the task of making multiple image manipulation libraries work together effectively. You built several applications exemplifying facial detection, 3D illusions, and augmented reality, demonstrating how easy it actually is to create the rich Kinect experiences you might have seen on the Internet when you are aware of and know how to use the right tools.
APPENDIX
Kinect Math
Building applications with Kinect is distinctly different from building other types of applications. The challenges and solutions presented in each Kinect application are not common to many C# or .NET applications, which focus on data entry or data processing. It is even rarer for web applications or applications built for mobile devices, games excluded, to encounter "Kinect problems." These problems all come down to math. Using Kinect means that developers have to work in three-dimensional spaces. Given the immaturity of Kinect as an input device, it is not integrated into graphical input systems the way other input devices, such as the mouse or stylus, are. Therefore, it is the job of the developer to do the work we have had the luxury of not having to do for years. As a result, it is quite possible that many developers have never had to manipulate bits directly, transform coordinate spaces, or work with 3D graphics.

Here are several mathematical formulas and topics any developer will encounter when developing Kinect experiences. The covered material is cursory and serves only as a reference or primer to give you the information needed to resolve most problems quickly. We encourage you to pull out old math books (you kept them, right?) and study more.
Unit of Measure
Kinect measures depth in millimeters. When processing raw depth data, the values are in millimeters. The vector positions of skeleton joints are in meters.

1000 mm = 1 m
1 mm = 0.0032808399 ft
1 m = 3.2808399 ft
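As a quick sketch of these conversions in code (the variable names here are illustrative, not SDK members):

```csharp
using System;

class UnitConversion
{
    static void Main()
    {
        // Raw depth values arrive in millimeters; skeleton joint
        // positions are reported in meters.
        int depthInMillimeters = 1500;                       // a raw depth reading
        double depthInMeters = depthInMillimeters / 1000.0;  // 1.5 m
        double depthInFeet = depthInMeters * 3.2808399;      // about 4.92 ft
        Console.WriteLine("{0} m, {1:F2} ft", depthInMeters, depthInFeet);
    }
}
```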
Bit Manipulation
Rarely do modern applications work with data at a bit level. Most applications do not need to process or manipulate bit data, and if the need arises, there are tools and libraries that do the actual work. These tools and libraries abstract the bit manipulation and processing from the developer. When working with Kinect, there are two instances where a developer will need to manipulate binary data at the bit level. Any application that processes depth data from the depth image stream has to manipulate bit data to extract the depth value for a given pixel position. The other occasion to work with bits is when working with the Quality properties on SkeletonFrame and Skeleton objects.
Bit Fields
Any enumeration in the .NET framework decorated with the FlagsAttribute attribute is a bit field: a collection of independent switches, or flags, stored in a single variable. Each bit in the variable represents a flag. The underlying data type of a bit field is an integer. Bit fields dramatically reduce the amount of memory needed to track Boolean data. A variable of type bool requires one byte of space and maintains the state of a single flag, whereas one byte of a bit field tracks the state of eight flags.

The FrameEdges enumeration is defined as a bit field. Enumerations of this type have values that are powers of two. For example, FrameEdges.Top has an integer value of four (4 = 2^2), which means that the Top flag is the third bit. The exponent defines the index position of the bit flag. Table A-1 shows the integer and binary values for the FrameEdges enumeration.
Table A-1. FrameEdges Bit Flags

Flag     Integer Value    Binary Value
None     0                00000000
Right    1                00000001
Left     2                00000010
Top      4                00000100
Bottom   8                00001000
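As a sketch, a bit-field enumeration matching Table A-1 is declared with the FlagsAttribute like this. The real FrameEdges type is supplied by the Kinect SDK; this stand-in only mirrors its values:

```csharp
using System;

// Stand-in mirroring Table A-1; each value occupies its own bit.
[Flags]
enum FrameEdges
{
    None   = 0,  // 00000000
    Right  = 1,  // 00000001
    Left   = 2,  // 00000010
    Top    = 4,  // 00000100
    Bottom = 8   // 00001000
}

class Program
{
    static void Main()
    {
        // Combined values render as a comma-separated list of set flags.
        Console.WriteLine(FrameEdges.Left | FrameEdges.Bottom);  // Left, Bottom
    }
}
```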
While it is common to work with bit fields in the .NET framework, it is rarely necessary to know the actual values of the bits. We present Table A-1 more to illustrate how bit fields work, which is important when manipulating the bit flags of a bit field. Bit manipulation uses a set of bitwise operators that mask specific bits. Bit masks work by taking one set of bits and applying a logical operation using another set of bits. The second operand in this logical equation is called the mask. There are three core logical operations used to mask bits. Developers employ any one of these logical operations to turn on, turn off, or complement bits.
Bitwise OR
The bitwise OR operator, denoted by the | (pipe) character, is the bit manipulation function that turns bits on. Use it to set the values of a bit flag. An example use case is to build criteria to test a skeleton's quality. For example, if the skeleton is clipped on the left or the right, the application might want to message the user to move closer to the center of Kinect's view area. To perform this test, the application needs to build a test operand, and this is accomplished using the bitwise OR operator. The bitwise AND operator is used to perform the actual test. The code to build the operand is as follows:

FrameEdges testOperand = (FrameEdges.Left | FrameEdges.Right);

This line of code tells the system to apply a logical OR to the specified values to create a new value. The result of the logical OR operation is stored in a new variable. Figure A-1 demonstrates a logical OR operation at the bit level.
Figure A-1. Logical OR applied to bit fields
The bitwise OR operation compares each bit of the operands. The result is zero if the bit of both operands is zero, and one if either of the operand bits is one. The operation is named OR because if the bit of one operand or the other is on (1), then the result is on. For example, consider bit position zero. Bit zero of the first operand (FrameEdges.Left) is off (0), but it is on for the second operand. The effect of a logical OR is that bit position zero is turned on. However, for bit position three the result is zero, because the third bit of both operands is zero.
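Numerically, the same OR works out like this (a sketch using plain integers to stand in for the enum values):

```csharp
using System;

class Program
{
    static void Main()
    {
        int left  = 2;               // FrameEdges.Left,  00000010
        int right = 1;               // FrameEdges.Right, 00000001
        int testOperand = left | right;  // 00000011
        Console.WriteLine(testOperand);  // 3
    }
}
```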
Bitwise AND
Where bitwise OR turns bits on, bitwise AND turns them off. Additionally, the bitwise AND is used to test the on state of certain bits. The AND operation compares each bit of two operands. For the result to be one, the bit of both operands must be one; otherwise the result is zero. Figure A-2 exemplifies the bitwise AND operation.
Figure A-2. Applying a bitwise AND mask
Using the bitwise AND to determine if a specific bit or bits are on is a common application of the operator. To test for specific on bits, apply a bit mask to a value, where the bit mask is the exact set of bits desired to be on. If the bits specified in the mask are on, the result is equal to the bit mask. Figure A-3 demonstrates this. Notice that Operand 2 (the bit mask) has the same bit pattern as the result.
Figure A-3. Testing for bits
In code, the & symbol represents the bitwise AND operation. Listing A-1 demonstrates a sample use case of the bitwise AND operation. The code checks a skeleton object to determine if the left side of the skeleton is being clipped, and if this is the case, it alerts the user to move to the right. The bitwise AND masks all bits except for the FrameEdges.Left bit. If this bit is on, the result is equal to FrameEdges.Left. Figure A-4 shows the bitwise AND operation at the bit level.
Listing A-1. Testing for Bits in Code
if ((skeleton.Quality & FrameEdges.Left) == FrameEdges.Left)
{
    //Alert the user to move to the right
}
Figure A-4. Bitwise AND bit math
The previous bit mask example only checks for a single bit. To test for multiple clipped edges (multiple bits), calculate the bit mask by OR'ing multiple values together using the bitwise OR operator. Listing A-2 demonstrates this in code, and Figure A-5 shows the bit math operations.
Listing A-2. Building and Testing a Complex Bit Mask
FrameEdges edgesBitMask = FrameEdges.Left | FrameEdges.Bottom;

if ((skeleton.Quality & edgesBitMask) == edgesBitMask)
{
    //Alert user that they are outside of the left and bottom boundaries.
}
Figure A-5. Bitwise math for a complex bit mask
In Kinect application development, a common use of the bitwise AND operation is extracting player index values. The SDK conveniently provides developers a bit mask to use. The bit mask is a constant defined on the DepthImageFrame class named PlayerIndexBitMask. The value is 7 (00000111), which, when applied to the data for a depth pixel, masks all bits except for the player index bits. Listing A-3 contains sample code that iterates through depth pixel data and extracts the player index value. Figure A-6 illustrates the bit math of the bitwise AND operation.
Listing A-3. Extracting the Player Index from Depth Data
for (int i = 0; i < pixelData.Length; i++)
{
    playerIndex = pixelData[i] & DepthImageFrame.PlayerIndexBitMask;
    //Do stuff with the player index value
}
Figure A-6. Bit math to extract the player index value
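A worked numeric example of the mask (the packed pixel value here is fabricated for illustration, though the mask value 7 matches DepthImageFrame.PlayerIndexBitMask):

```csharp
using System;

class Program
{
    static void Main()
    {
        const int PlayerIndexBitMask = 7;   // 00000111
        // Pack a depth of 1500 mm into bits 3-15 and player index 2 into bits 0-2.
        int pixelData = (1500 << 3) | 2;
        int playerIndex = pixelData & PlayerIndexBitMask;
        Console.WriteLine(playerIndex);     // 2
    }
}
```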
Bitwise NOT (Complement)
When working with depth bits, Chapter 3 used the bitwise complement operator to invert the bits. The bitwise complement operator in C# is the ~ (tilde) symbol. Listing 3-7 used the operator to invert the colors of the depth bits. Depth values in the bits of a depth image frame range from 0 to 4095. If these bits are used as is to create images, the shades of gray are much closer to black than white. Simply inverting the bits produces grays that are closer to white than they are to black.
In a depth image, the depth value of each pixel is 16 bits, or two bytes. In a 16-bit color palette, black is zero and 65535 is white. The integer value when all 16 bits are turned on is 65535, which makes it the complement of zero. Figure A-7 shows another example of the bitwise complement operation.
Figure A-7. Complementing bits
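A small sketch of the complement at work on 16-bit values:

```csharp
using System;

class Program
{
    static void Main()
    {
        ushort black = 0;
        ushort white = (ushort)~black;   // all 16 bits on
        Console.WriteLine(white);        // 65535

        ushort darkGray = 500;           // raw depth shades cluster near black
        ushort lighter = (ushort)~darkGray;
        Console.WriteLine(lighter);      // 65035, much closer to white
    }
}
```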
Bit Shifting
When looking at the bits of a 16-bit number (the short data type in .NET), as in Figure A-8, the least significant bit is bit zero on the far right. Bit 15, on the far left, is the most significant bit. Endianness describes the order of bytes stored in memory. If bits 8-15 are stored first in memory, then the bytes are in big endian order. Little endian order is the opposite, and is when bits 0 to 7 are stored in memory before bits 8-15.
Figure A-8. Bit significance
Bit shifting is an operation of moving or shifting bits either left or right. After a shift, the significance of a bit is said to have changed; how so depends on the direction of the shift. In C# the bitwise shift operators are >> to shift right and << to shift left. Figure A-9 demonstrates the basics of bit shifting. It first shows the result of shifting right two bits. Notice, for example, how the bit value of bit 2 is now in bit 0 and the value of bit 4 is now at bit 2. The values of bits 14 and 15 automatically become zero; conversely, when shifting left two bits, bits 0 and 1 are set to zero.
Figure A-9. Shifting bits
Bit shifting is used to extract the actual depth from depth pixel data. Depth pixel data consists of 16 bits, where the first three (0-2) store the player index value and the remaining bits (3-15) store the depth of the pixel measured in millimeters. To get the depth value for the pixel, you must shift out the player index bits. The SDK defines a constant on the DepthImageFrame class named PlayerIndexBitMaskWidth. Listing A-4 shows code to extract the depth from depth pixel data, and Figure A-10 shows the operation at a bit level.
Listing A-4. Extracting Depth
depth = pixelData[pixelIndex] >> DepthImageFrame.PlayerIndexBitMaskWidth;
Figure A-10. Extracting depths
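A numeric sketch of the shift (again packing a fabricated pixel value; the shift width 3 matches DepthImageFrame.PlayerIndexBitMaskWidth):

```csharp
using System;

class Program
{
    static void Main()
    {
        const int PlayerIndexBitMaskWidth = 3;
        int pixelData = (2047 << 3) | 6;   // depth 2047 mm, player index 6
        int depth = pixelData >> PlayerIndexBitMaskWidth;
        Console.WriteLine(depth);          // 2047: the player index bits fall away
    }
}
```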
Geometry and Trigonometry
Until the specialized domain of Kinect development matures, developers will have to write code to detect poses, gestures, or control interface interactions. Frequently, this code will involve geometry and trigonometry. Included here are several formulas and common uses of both in Kinect development. This type of math is used to triangulate the position of players, calculate the angles of joints, and determine distances from objects. Kinect is a 3D input device mapping a 3D world, and even if your application does not involve 3D graphics, many of the same fundamentals apply.

Remember, when triangulating joints and their relation to other objects within coordinate space, you only need two points to create a triangle. The third point can be arbitrary or derived from the other two points. Often we are trying to calculate some property related to the two vectors, where the value of the third vector has no material effect on the calculated result.
Figure A-11 shows the formulas to calculate the distance between two points. The first formula is for two-dimensional points, and the second is for three-dimensional distances. These functions are built into the .NET framework, but in two different locations: System.Windows.Vector and System.Windows.Media.Media3D.Vector3D. However, this requires your application to convert the skeleton points into either of these two objects. At times, this extra overhead is problematic, requiring the calculations to be done manually, which is why they are included here.
Figure A-11. Distance between two points
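In symbols, the two distance formulas are:

```latex
d_{2D} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}
\qquad
d_{3D} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}
```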
The triangulation of points uses basic trigonometric functions. Figure A-12 lists the basic trigonometric functions and their inverse functions, as well as the Pythagorean theorem. When applied to Kinect, all sides of the triangle are known by calculating the distance (Figure A-11) between the three points of the triangle.
Figure A-12. Basic trigonometry functions for right triangles
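For a right triangle with opposite leg a, adjacent leg b, and hypotenuse c, the functions in question are:

```latex
\sin\theta = \frac{a}{c}, \quad \cos\theta = \frac{b}{c}, \quad \tan\theta = \frac{a}{b},
\qquad a^2 + b^2 = c^2
```

The inverse functions recover the angle from a known ratio, for example θ = arcsin(a/c) = arccos(b/c) = arctan(a/b).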
The trigonometric functions in .NET operate on radian values rather than degrees. It is common to translate from degrees to radians and back. Neither the .NET framework nor the Kinect SDK provides built-in functionality for these conversions. This is left to the developer. Figure A-13 shows these formulas, and Figure A-14 shows the unit circle.
Figure A-13. Degree and radian conversions
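The conversions themselves are:

```latex
\text{radians} = \text{degrees} \times \frac{\pi}{180}
\qquad
\text{degrees} = \text{radians} \times \frac{180}{\pi}
```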
Figure A-14. The unit circle
The Law of Cosines, as shown in Figure A-15, calculates the angles of any type of triangle. This is useful when determining the angle between two joints, as demonstrated in Chapter 5. Applying the formula for this purpose requires a third point, which can be another joint position, but generally should be a point along the X-axis from the base point. The largest angle calculable by the Law of Cosines is 180 degrees. When calculating the angles between joints, this means additional calculating to determine angles from 180 to 360 degrees, but this is trivial. Another way to calculate the angle of joints is through the dot product of two vectors, where the position of each angle joint is a vector. WPF provides help not just with calculating dot products, but also with calculating the angle between two vectors. In the System.Windows.Media.Media3D namespace is a class named Vector3D. The Vector3D class has several methods for working with vectors, including DotProduct and AngleBetween. It is possible to use the Vector3D class with 2D vectors as well: just assign zero to the Z property.
Figure A-15. Law of Cosines
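For a triangle with sides a, b, and c, where C is the angle opposite side c, the Law of Cosines and its solved-for-the-angle form are:

```latex
c^2 = a^2 + b^2 - 2ab\cos C
\quad\Longrightarrow\quad
C = \arccos\!\left(\frac{a^2 + b^2 - c^2}{2ab}\right)
```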
Index
A

Audio capture
    acoustic echo cancelation, 240–241
    noise suppression, 238–240
    sound stream
        bindable properties, 235
        instantiating and configuring, 236
        Record and Playback buttons, 235
        Record and Playback methods, 237
        RecorderHelper class, 232
        Recorder window, 231
        TrackingRecordingState, 234–235
        writing to Wav file, 232–233

B

Beam tracking
    BeamChanged event handler, 244
    indicator, 241
    MainWindow class implementation, 243
    speech direction indicator, 242

C

Coding4Fun Kinect Toolkit, 256–258
    browsable source code, 257
    extension methods, 257–258
    AddOne method, 257
    bitmap methods, 262–263
    bitmap source methods, 263
    conversion methods, 264–265
    ImageExtensions class, 260–262
    in MainWindow code behind, 264
    pixel formats, 263
    WPF project, 259–260
    WinForms, 256
D

Depth image processing, 49
    application, 65–66
    gray shades
        application output, 57
        coloring, 59, 60
        comparison, 61
        light shade, 57
        new StackPanel and Image element, 56
        new version creation, 58–59
        visualizations, 59
    histogram
        building, 62–63
        MainWindow.xaml update, 62
        output image, 64
        statistical distributions, 62
        thresholding, 65
        wall filtered out and holding newspaper in front of image, 64–65
    image alignment
        background subtraction process, 79–80
        green screening, 76
        mapping method, 80
        noisy pixels, 80
        polling infrastructure, 77–79
        stereo vision, 76
        XAML, 76
    measurement
        bit manipulation, 53
        CalculatePlayerSize method, 71–73
        depth bits layout, 52
        different poses, 75
        display depth values, 53
        field of view, 51
        hard-code, 54
        ItemsControl, 71
        mouse-up event handler, 53–54
        output display, 55
        output properties, 75
Depth image processing, measurement (cont.)
        PlayerDepthData class, 73–75
        player real world width, 70
        player width and height, 69
        UI, 70, 71
    near mode, 81–82
    player indexing
        displaying in black and white, 67–68
        index bits, 67
        KinectExplorer, 69
        raw and processed images, 67, 69
    raw depth image frame, 51
    steps, 49
    thresholding, 61
E

Emgu CV, 272

F

Facial detection, 279
    algorithm, 280
    augmented reality implementation, 282
    FaceFinder, 279
    Pulse method, 280
    setup code, 280
    skeletal tracking, 281
Flexible Action and Articulated Skeleton Toolkit (FAAST), 288
G

Gestures, 167
    affordance and feedback, 173–174
    arbitrary and conventional, 169
    aspects, 169
    conventions, 173
    definition, 167
    in arts, 167
    central distinction, 168
    challenges, 168
    dictionary approach, 167
    in human-computer communication, 168
    in human-computer interaction, 169
    future aspects, 220–221
    gestural idioms, 173
    gesture detection
        algorithms, 175
        exemplar approach, 176
        hand tracking (see Hand tracking, gestures)
        hover button, 200–202
        magnet button, 204–210
        magnetic slide, 214–217
        neural networks, 175–176
        push button, 203–204
        swipe, 210–214
        universal pause, 219
        vertical scroll, 217–219
        wave gesture, 177–183
    interaction idioms, 172
    limitations, 221
    natural user interface (NUI), 170
        buttons, 171
        characteristics, 171
        direct manipulation, 170
        gestural interfaces, 171
        goals, 170
        posing, 171
        speech interfaces, 171
        touch interfaces, 171
    in restaurant, 169
    tasks, 167
H
Hand tracking, gestures
  CursorAdorner class
    AdornerBase class method, 188–189
    cursor animations, 190–191
    passing coordinate positions, 189–190
    visual element, 187–188
  KinectButton
    base class, 199
    base implementation, 197–199
    click event, 200
  KinectCursorEventArgs
    constructor overloads, 184–185
    structure, 184
  KinectCursorManager
    constructors, 193–195
    event management, 195–196
    helper methods, 192–193
    MapSkeletonPointToDepth method, 197
    SkeletonFrameReady method, 196
    UpdateCursor method, 196
  KinectInput, event declaration, 185–187
  Kinect to WPF data translation, 197–199
Holograms, 283
  Blender, 284
  configuration, 286
  3D cube, 284
  3D oblongs, 287
  moving the camera around the cube, 286
  Viewport3D object, 284
I, J
Image manipulation helper methods, 256
K
Kinect, 1
  applications, 16
  Explorer, 17–18
  hardware
    components, 10
    glossy black case, 9
    microphone array, 10
    power source, 10
    requirements, 11
    zoom, 10
  image processing (see Depth image processing)
  installation
    acoustic models, 13
    drivers, 13
    instructions, 12
    microphone array, 13
    OpenNI, 12
  Microsoft Research (MSR)
    microphone array, 7
    motion-tracking, 5
    player blob, 6
    player parts, 6
  Minority Report, 2
  Natal device, 4
  Record Audio, 20
  Shape Game, 18–19
  skeleton track (see Skeleton tracking)
  software requirements, 11–12
  speech sample, 20–21
  time-of-flight technique, 4
  vision recognition, 4
  Visual Studio project
    add reference, 14
    applications, 14
    basic steps, 14
    KinectDepthStreamData, 16
    KinectSensor object, 15
  Wii Remote, 3
Kinect 3DV, 3
Kinect math, 291
  bit fields, 292
  bit manipulation, 291
  bit shifting, 296
  bitwise AND operator, 293
  bitwise NOT (complement) operator, 295
  bitwise OR operator, 292
  geometry, 297
  trigonometry
    degree and radian conversions, 298
    distance between two points, 297
    Law of Cosines, 299
    for right triangles, 298
    unit circle, 298
  unit of measure, 291
Kinect sensor, 23
  collection object, 24
  color image stream
    display, 30
    enable method, 29
    frame-ready event handler, 30
    video camera, 28
  data streams, 24
  detection and monitoring, 25–27
  hardware, 24
  image manipulation
    Kinect_ColorFrameReady event handler, 32
    pixel shading, 33–34
  image performance
    bitmap image creation, 31
    frame image, 31
    image pixel update, 32
    memory allocation and deallocation, 32
  loaded and unloaded events, 27
  object reflection
    class diagram, 36, 37
    color image stream formats, 38
    ImageStream class, 38
    Timestamp, 39
  polling application
    advantages, 46
    base code, 40
    discover and initialization, 41, 42
    OpenNextFrame method, 39
    PollColorImageStream method, 42
    rendering event, 41
    thread, 43–44
    UI thread update, 45–46
  snapshots
    add button, 35
    TakePictureButton_Click event handler, 35
    test application, 36
    Xbox Kinect games, 34
  starting process, 28
  StatusChanged event handler, 27
  status values, 24
  stopping process, 28
  thread safety and release resources, 28
  wrapper, 27
Kinect the Dots game
  expansion, 112–113
  feature set, 100
  hand tracking
    application output, 105
    cursor position update, 104
    location and movements, 102
    primary hand, 103, 104
    SkeletonFrameReady event handler, 102–103
  puzzle drawing, 106–108
  solving puzzle, 108–112
  user interface
    image, 101
    Viewbox and Grid elements, 101
    XAML, 101
L
Libfreenect, 12
M, N
Microphone array, 223, 224
  KinectAudioSource
    Acoustic Echo Cancellation (AEC), 225
    Acoustic Echo Suppression (AES), 225
    Automatic Gain Control (AGC), 225
    BeamAngleMode, 227
    BeamForming, 225
    CenterClipping, 225
    EchoCancellationMode, 226
    feature properties, 226
    FrameSize, 225
    GainBounding, 225
    NoiseFilling, 225
    Noise Suppression (NS), 225
    Optibeam, 225
    SingleChannel, 225
    Signal-to-Noise Ratio (SNR), 225
  speech recognition (see also Speech recognition)
    command recognition, 227
    Engine object configuration, 230
    free-form dictation, 227
    grammars, 228–229
    overloaded methods, 230
    result properties, 229
    SetInputToAudioStream method, 230
  Voice Capture DirectX Media Object (DMO), 224
Microsoft Kinect Audio Array Control, 12
Microsoft Kinect Camera, 12
Microsoft Kinect Device, 12
Microsoft Robotics Developer Studio, 288
O
OpenCV (Open Computer Vision), 272
P, Q, R
Proximity detection, 265
  with depth data, 269–270
  _isTracking, 266
  motion detection
    algorithm, 275–277
    configuration, 275
    EmguCV, 272
    Emgu extension methods, 273–274
    OpenCV, 272
    polling technique, 274
    strategies, 272
  with player data and depth data, 271
  saving the video
    Record and StopRecording methods, 279
    RecordingVideo, 278
  SkeletonFrameReady event, 267
S
Simon Says game
  application output, 139
  build infrastructure, 133–135
  ChangePhase method, 139–140
  commands, 140–142
  enhancements, 145, 163
  indicators, 145
  instruction sequence, 129, 146
  interactive components, 131
  joint position calculation, 161
  output, 164, 165
  play infrastructure, 135–136
  PoseAngle class, 158
  pose detection, 157
  pose library creation, 159
  presentation, 146
  ProcessGameOver update, 160–161
  processing player movements, 142–143
  serialization, 163
  SkeletonFrameReady event handler, 136–137
  starting game, 137–139
  timer initialization, 162
  UI components, 131
  UIElement object, 144
  user experience, 145
  user interaction, 146
  user interface, 130
  user movement detection, 144–145
  XAML, 131–133
Skeleton tracking, 85, 121
  application displays, 90, 91
  array of brushes, 87
  coordinate space, 113, 114
  depth-based user interaction
    hand in different position, 151
    hand in same position, 151, 153
    hand tracking, 150–151
    layout system, 146
    visual elements, 149
    XAML update, 147–149
  drawing skeleton joints, 89
  hit testing
    Button, 126
    Canvas panel, 128
    definition, 125
    dot proximity, 125
    Grid and StackPanel, 128
    hand cursors, 129
    InputHitTest method, 127
    layout space and bounding boxes, 127, 128
    Natural User Interface design, 129
    position points, 129
    Shape Game application, 125, 126
    visual element layering, 126, 127
  initialization, 86–87
  Kinect the Dots
    feature set, 100
    game expansion, 112–113
    hand tracking, 102–105
    puzzle drawing, 106–108
    solving puzzle, 108–112
    user interface, 101–102
  KinectSensor object, 85
  mirrored effect, 115
  object model
    class, 96
    ClippedEdges field, 98
    enable and disable methods, 93
    frame descriptors, 96
    FrameNumber and Timestamp fields, 95–96
    identifier, 97
    joint object, 98–100
    position field, 97
    schematic representation, 91, 92
    selections of, 95
    SkeletonFrame objects, 95
    SkeletonStream, 93
    smoothing, 93–94
    TrackingState values, 97
  poses
    execution, 157
    gesture, 153
    hit testing, 154
    joint position calculation, 156
    joint triangulation method, 156
    Kinect, 154
    Law of Cosines, 155
    T pose, 155
    type and complexity, 154
    umpires, 153
  SkeletonViewer user control
    dependency property, 116–117
    drawing joints, 117–119
    initialization, 119
    Kinect the Dots, 119, 120
    XAML, 116
  space transformations, 114
  stick figure creation, 87–89
  user interaction
    detection, 124–125
    graphical user interface, 122
    touch/stylus devices, 122
Beginning Kinect Programming with the Microsoft Kinect SDK

Jarrett Webb
James Ashley
Beginning Kinect Programming with the Microsoft Kinect SDK

Copyright © 2012 by Jarrett Webb, James Ashley

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

ISBN-13 (pbk): 978-1-4302-4104-1
ISBN-13 (electronic): 978-1-4302-4101-8

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

President and Publisher: Paul Manning
Lead Editor: Jonathan Gennick
Technical Reviewer: Steven Dawson, Alastair Aitchison
Editorial Board: Steve Anglin, Mark Beckner, Ewan Buckingham, Gary Cornell, Louise Corrigan, Morgan Ertel, Jonathan Gennick, Jonathan Hassell, Robert Hutchinson, Michelle Lowman, James Markham, Matthew Moodie, Jeff Olson, Jeffrey Pepper, Douglas Pundick, Ben Renow-Clarke, Dominic Shakeshaft, Gwenan Spearing, Matt Wade, Tom Welsh
Coordinating Editor: Annie Beck, Brent Dubi
Copy Editor: Jill Steinberg
Compositor: Bytheway Publishing Services
Indexer: SPI Global
Artist: SPI Global
Cover Designer: Anna Ishchenko

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com.

For information on translations, please e-mail rights@apress.com, or visit www.apress.com.

Apress and friends of ED books may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Special Bulk Sales–eBook Licensing web page at www.apress.com/bulk-sales.

Any source code or other supplementary materials referenced by the author in this text is available to readers at www.apress.com. For detailed information about how to locate your book's source code, go to www.apress.com/sourcecode/.
We dedicate this book to our families, Meredith, Tamara, Sasha, Paul and Sophia, who supported us, stood by us, cheered us on and were exceedingly patient with us through this process.

––Jarrett and James
Contents

About the Authors .................................................................................................. xi
About the Technical Reviewer ............................................................................... xii
Acknowledgments ................................................................................................ xiii
Introduction .......................................................................................................... xiv
Chapter 1: Getting Started ...................................................................................... 1
The Kinect Creation Story .................................................................................................. 1
Pre-History ............................................................................................................................................... 1
The Minority Report ................................................................................................................................. 2
Microsoft's Secret Project ........................................................................................................................ 3
Microsoft Research ................................................................................................................................... 5
The Race to Hack Kinect ........................................................................................................................... 7
The Kinect for Windows SDK ............................................................................................. 9
Understanding the Hardware .................................................................................................................... 9
Kinect for Windows SDK Hardware and Software Requirements ........................................................... 11
Step-By-Step Installation ........................................................................................................................ 12
Elements of a Kinect Visual Studio Project ...................................................................... 14
The Kinect SDK Sample Applications ............................................................................... 16
Kinect Explorer ....................................................................................................................................... 17
Shape Game ............................................................................................................................................ 18
Record Audio .......................................................................................................................................... 20
Speech Sample ....................................................................................................................................... 20
Summary ......................................................................................................................... 22
Chapter 2: Application Fundamentals ................................................................... 23
The Kinect Sensor ............................................................................................................ 24
Discovering a Connected Sensor ............................................................................................................ 24
Starting the Sensor ................................................................................................................................. 28
Stopping the Sensor ................................................................................................................................ 28
The Color Image Stream ................................................................................................... 28
Better Image Performance ..................................................................................................................... 31
Simple Image Manipulation .................................................................................................................... 32
Taking a Snapshot .................................................................................................................................. 34
Reflecting on the Objects ................................................................................................. 36
Data Retrieval: Events and Polling ................................................................................... 39
Summary ......................................................................................................................... 46
Chapter 3: Depth Image Processing ..................................................................... 49
Seeing Through the Eyes of the Kinect ............................................................................ 49
Measuring Depth .............................................................................................................. 51
Enhanced Depth Images ................................................................................................... 56
Better Shades of Gray ............................................................................................................................. 56
Color Depth ............................................................................................................................................. 59
Simple Depth Image Processing ...................................................................................... 61
Histograms .............................................................................................................................................. 62
Further Reading ...................................................................................................................................... 65
Depth and Player Indexing ............................................................................................... 66
Taking Measure ................................................................................................................ 69
Aligning Depth and Video Images .................................................................................... 76
Depth Near Mode ............................................................................................................. 81
Summary ......................................................................................................................... 82
Chapter 4: Skeleton Tracking ................................................................................ 85
Seeking Skeletons ............................................................................................................ 85
The Skeleton Object Model .............................................................................................. 91
SkeletonStream ....................................................................................................................................... 93
SkeletonFrame ........................................................................................................................................ 95
Skeleton .................................................................................................................................................. 96
Joint ........................................................................................................................................................ 99
Kinect the Dots ............................................................................................................... 100
The User Interface ................................................................................................................................ 101
Hand Tracking ....................................................................................................................................... 102
Drawing the Puzzle ............................................................................................................................... 106
Solving the Puzzle ................................................................................................................................. 108
Expanding the Game ............................................................................................................................. 112
Space and Transforms ................................................................................................... 113
Space Transformations ........................................................................................................................ 114
Looking in the Mirror ............................................................................................................................ 115
SkeletonViewer User Control ......................................................................................... 115
Summary ....................................................................................................................... 120
Chapter 5: Advanced Skeleton Tracking ............................................................. 121
User Interaction ............................................................................................................. 122
A Brief Understanding of the WPF Input System .................................................................................. 122
Detecting User Interaction .................................................................................................................... 124
Simon Says .................................................................................................................... 129
Simon Says, "Design a User Interface" ................................................................................................ 131
Simon Says, "Build the Infrastructure" ................................................................................................ 133
Simon Says, "Add Game Play Infrastructure" ...................................................................................... 135
Starting a New Game ............................................................................................................................ 138
Enhancing Simon Says ......................................................................................................................... 145
Reflecting on Simon Says ..................................................................................................................... 146
Depth-Based User Interaction ........................................................................................ 146
Poses .............................................................................................................................. 153
Pose Detection ...................................................................................................................................... 154
Reacting to Poses ................................................................................................................................. 156
Simon Says Revisited ..................................................................................................... 157
Reflect and Refactor ....................................................................................................... 163
Summary ....................................................................................................................... 166
Chapter 6: Gestures ............................................................................................ 167
Defining a Gesture .......................................................................................................... 167
NUI .................................................................................................................................. 170
Where Do Gestures Come From? ................................................................................... 172
Implementing Gestures .................................................................................................. 174
Algorithmic Detection ........................................................................................................................... 175
Neural Networks ................................................................................................................................... 175
Detection by Example ........................................................................................................................... 176
Detecting Common Gestures ......................................................................................... 176
The Wave .............................................................................................................................................. 177
Basic Hand Tracking ............................................................................................................................. 183
Hover Button ......................................................................................................................................... 200
Push Button .......................................................................................................................................... 203
Magnet Button ...................................................................................................................................... 204
Swipe .................................................................................................................................................... 210
Magnetic Slide ...................................................................................................................................... 214
Vertical Scroll ....................................................................................................................................... 217
Universal Pause .................................................................................................................................... 219
The Future of Gestures ................................................................................................... 220
Summary ....................................................................................................................... 222
Chapter 7: Speech ............................................................................................... 223
Microphone Array Basics ............................................................................................... 224
MSR Kinect Audio ................................................................................................................................. 224
Speech Recognition .............................................................................................................................. 227
Audio Capture ................................................................................................................. 231
Working with the Sound Stream ........................................................................................................... 231
Cleaning Up the Sound ......................................................................................................................... 238
Canceling Acoustic Echo ...................................................................................................................... 240
Beam Tracking for a Directional Microphone ................................................................. 241
Speech Recognition ........................................................................................................ 244
Summary ....................................................................................................................... 254
Chapter 8: Beyond the Basics ............................................................................. 255
Image Manipulation Helper Methods ............................................................................. 256
The Coding4Fun Kinect Toolkit ............................................................................................................. 256
Your Own Extension Methods ............................................................................................................... 258
Proximity Detection ........................................................................................................ 265
Simple Proximity Detection .................................................................................................................. 265
Proximity Detection with Depth Data .................................................................................................... 269
Refining Proximity Detection ................................................................................................................ 270
Detecting Motion .................................................................................................................................. 272
Saving the Video ................................................................................................................................... 277
Identifying Faces ............................................................................................................ 279
Holograms ...................................................................................................................... 283
Libraries to Keep an Eye On ........................................................................................... 288
Summary ....................................................................................................................... 289
Appendix: Kinect Math ........................................................................................ 291
Unit of Measure .............................................................................................................. 291
Bit Manipulation ............................................................................................................. 291
Bit Fields ........................................................................................................................ 292
Bitwise OR ...................................................................................................................... 292
Bitwise AND ................................................................................................................... 293
Bitwise NOT (Complement) ............................................................................................ 295
Bit Shifting ..................................................................................................................... 296
Geometry and Trigonometry ........................................................................................... 297
Index ................................................................................................................... 301
About the Authors
Jarrett Webb creates imaginative, dynamic, interactive, immersive experiences using multi-touch and the Kinect. He lives in Austin, Texas.

James Ashley has been developing primarily Microsoft software for nearly 15 years. He maintains a blog at www.imaginativeuniversal.com. He helps run the Atlanta XAML user group and for the past two years has led the organizing of the reMIX conference in Atlanta. He is currently employed as a Presentation Layer Architect in the Emerging Experiences Group at Razorfish where he is encouraged to play with expensive technology and develop impossible applications. It is the sort of job he always knew he wanted but didn't realize actually existed.

James lives in Atlanta, Georgia with his wife, Tamara, and three children: Sophia, Paul and Sasha. You can contact him by email at jamesashley@imaginativeuniversal.com or contact him on twitter at @jamesashley.
About the Technical Reviewer
Steve Dawson is the Technology Director of the Emerging Experiences group at Razorfish. He collaborates with a cross-functional team of strategists and designers to bring innovative and engaging experiences to life using the latest technologies.

As the technical director of the Emerging Experiences group, Steve led the technology effort to launch the first Microsoft Surface solution worldwide and has deployed more than 4,000 experiences in the field for a variety of clients utilizing emerging technology.

In addition to supporting client solutions, Steve is responsible for R&D efforts within Razorfish related to emerging technologies: surface computing, augmented reality, ubiquitous computing, computer vision and gestural interface development using Kinect for Windows. His recent work with the Kinect platform has been recognized and praised by a variety of media outlets including Wired, Fast Company, Mashable and Engadget. Steve is active on the conference circuit, most recently speaking at E3 and SXSW.

Steve has been with Razorfish for 11 years and has had the pleasure of working with a variety of clients, some of which include Microsoft, AT&T, Dell, Audi, Delta, Kraft, UPS and Coca-Cola.
Acknowledgments
Many people have provided assistance and inspiration as we wrote this book. We would first like to thank our current and prior colleagues at Razorfish: Steve Dawson, Luke Hamilton, Alex Nichols, Dung Tien Le and Jimmy Moore for freely sharing their ideas and even, at times, their code with us. Being surrounded by knowledgeable and clever people has helped us to make this book better. We would also like to thank our employers at Razorfish, in particular Jonathan Hull, for creating an environment where we can explore new design concepts and bleeding edge technology in order to build amazing experiences.

We are indebted to Microsoft's Kinect for Windows team for providing us access to internal builds as well as assistance in understanding how the Kinect SDK works, especially: Rob Relyea, Sheridan Jones, Bob Heddle, JP Wollersheim and Mauro Giusti.

We would also like to thank the hackers, the academics, the artists and the madmen who learned to program for the Kinect sensor long before there was a Kinect for Windows SDK and subsequently filled the internet with video after inspiring video showing the versatility and ingenuity of the Kinect hardware. We were able to write this book because they lit the way.

–Jarrett and James
INDEX
Skeleton tracking, stick figure creation (cont.)
  WPF input system, 122–124
Speech, 223
  audio capture (see also Audio capture)
    acoustic echo cancelation, 240–241
    noise suppression, 238–240
    sound stream, 231–238
  directional microphone (see Beam tracking)
  microphone array (see Microphone array)
Speech recognition
  CreateAudioSource method, 247
  Crosshairs user control, 245
  event handlers, 251
  GrammarBuilder, 252
  HandTracking, 249
  InterpretCommands method, 252
  KinectSensor and SpeechRecognitionEngine, 248
  LaunchAsMTA method, 247
  MainWindow class implementation, 246
  MainWindow XAML, 246
  Put That There application, 245
  StartSpeechRecognition method, 250
Stylus device, 122
T
U
Unity3D, 288
W, X, Y, Z
Wave gesture
  constants, 179
  detection class, 180
  detection methodology, 177
  helper methods, 181–183
  neutral zone, 178
  TrackWave method, 180–181
  user waving, 177
  WaveGestureState, 178
  WaveGestureTracker, 178
  WavePosition, 178
WPF input system
  API, 122
  component, 123
  controls, 124
  InputManager object, 123
  joint positions, 124
  single pixel point location, 124
  touch input, 123
  TouchDevice, 122