
Sparse Domain Adaptation in a Good Similarity-Based Projection Space
Emilie Morvant, Amaury Habrard, Stéphane Ayache

To cite this version: Emilie Morvant, Amaury Habrard, Stéphane Ayache. Sparse Domain Adaptation in a Good Similarity-Based Projection Space. Workshop at NIPS 2011: Domain Adaptation Workshop: Theory and Application, Dec 2011, Grenade, Spain. HAL Id: hal- (https://hal.archives-ouvertes.fr/hal-). Submitted on 21 Dec 2011.
Sparse Domain Adaptation in a Good Similarity-Based Projection Space*

Emilie Morvant, Stéphane Ayache
Aix-Marseille Univ, LIF-Qarma, CNRS, UMR, Marseille, France
firstname.lastname@lif.univ-mrs.fr

Amaury Habrard
Univ of St-Etienne, Laboratoire Hubert Curien, CNRS, UMR, St-Etienne, France
amaury.habrard@univ-st-etienne.fr

We address domain adaptation (DA) for binary classification in the challenging case where no target label is available. We propose an original approach that stands in a recent framework of Balcan et al. [1], which allows learning linear classifiers in an explicit projection space based on good similarity functions that may be neither symmetric nor positive semi-definite (PSD). Following the DA framework of Ben-David et al. [2], our method looks for a relevant projection space where the source and target distributions tend to be close. This objective is achieved through an additional regularizer motivated by the notion of algorithmic robustness proposed by Xu and Mannor [3]. Our approach is formulated as a linear program with a 1-norm regularization, leading to sparse models. We provide a theoretical analysis of this sparsity and a generalization bound. From a practical standpoint, to improve the efficiency of the method, we propose an iterative version based on a reweighting scheme of the similarities that moves the distributions closer in a new projection space. Hyperparameters and reweighting quality are controlled by a reverse validation process. The evaluation of our approach on a synthetic problem and on real image annotation tasks shows good adaptation performance.

This work will appear in "IEEE International Conference on Data Mining (ICDM) 2011" [4].

Notations

Let $X \subseteq \mathbb{R}^d$ be the input space of dimension $d$ and $Y = \{-1, +1\}$ the label set. A domain is a probability distribution over $X \times Y$.
In a DA framework [2, 5], we have a source domain represented by a distribution $P_S$ and a target domain represented by a somewhat different distribution $P_T$. $D_S$ and $D_T$ are the respective marginal distributions over $X$. A learning algorithm is provided with a labeled source sample $LS = \{(x_i, y_i)\}_{i=1}^{d_l}$ drawn i.i.d. from $P_S$, and an unlabeled target sample $TS = \{x_j\}_{j=1}^{d_t}$ drawn i.i.d. from $D_T$. Let $h : X \to Y$ be a hypothesis function. The expected source error of $h$ over $P_S$ is the probability that $h$ commits an error: $err_S(h) = \mathbb{E}_{(x,y)\sim P_S}\, L_{01}(h,(x,y))$, where $L_{01}(h,(x,y)) = 1$ if $h(x) \neq y$ and $0$ otherwise; this is the 0-1 loss function. The target error $err_T$ over $P_T$ is defined in a similar way. $\hat{err}_S$ and $\hat{err}_T$ are the empirical errors. A hypothesis class $\mathcal{H}$ is a set of hypotheses. The DA objective is then to find a hypothesis with a low target error.

Domain Adaptation Framework

We consider the DA framework proposed by Ben-David et al. [2], which allows us to upper bound the target error $err_T$ according to the source error and the divergence between the domain distributions:

$\forall h \in \mathcal{H},\quad err_T(h) \;\le\; err_S(h) + \frac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T) + \nu.$  (1)

The last term $\nu$ can be seen as a kind of adaptation-ability measure of $\mathcal{H}$ for the DA problem considered and corresponds to the error of the best joint hypothesis over the two domains:

*Work partially supported by the ANR VideoSense project (ANR-09-CORD-026) and the PASCAL2 network of excellence.
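As an aside on bound (1): its divergence term can be estimated from finite unlabeled samples by training a classifier to separate the source sample from the target sample (this is made precise below). A minimal illustrative sketch of the common "proxy A-distance" heuristic, $2(1 - 2\,err_{dom})$, using a small plain-NumPy logistic-regression domain classifier (this is our own sketch, not the authors' code; all function names are ours):

```python
import numpy as np

def proxy_distance(XS, XT, epochs=300, lr=0.1):
    """Heuristic proxy for the divergence between two unlabeled samples:
    train a linear domain classifier (source = 0, target = 1) and return
    2 * (1 - 2 * training_error). Near 0 when the samples are hard to
    tell apart, near 2 when they are easily separated."""
    X = np.vstack([XS, XT])
    X = np.hstack([X, np.ones((len(X), 1))])          # bias feature
    y = np.concatenate([np.zeros(len(XS)), np.ones(len(XT))])
    w = np.zeros(X.shape[1])
    for _ in range(epochs):                           # logistic regression by gradient descent
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    err = np.mean(((X @ w) > 0) != y)                 # domain-classification error
    return 2.0 * (1.0 - 2.0 * err)

rng = np.random.default_rng(0)
XS = rng.normal(0.0, 1.0, (200, 2))        # source sample
XT_near = rng.normal(0.3, 1.0, (200, 2))   # target close to the source
XT_far = rng.normal(4.0, 1.0, (200, 2))    # target far from the source
print(proxy_distance(XS, XT_near), proxy_distance(XS, XT_far))
```

A projection space well suited to DA is, in this spirit, one where such a separability measure is low while the source error stays low.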
$\nu = \min_{h \in \mathcal{H}} \big( err_S(h) + err_T(h) \big)$. The second term $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T)$ is called the $\mathcal{H}\Delta\mathcal{H}$-distance between the two domain marginal distributions. This measure is actually related to $\mathcal{H}$, and an interesting point is that when the VC-dimension of $\mathcal{H}$ is finite, we can estimate $d_{\mathcal{H}\Delta\mathcal{H}}$ from finite samples by looking for the best classifier able to separate $LS$ from $TS$. This bound suggests that one possible route to good DA is to look for a relevant data projection space where both the $\mathcal{H}\Delta\mathcal{H}$-distance and the source error of a classifier are low (two aspects a priori necessary for good DA [6]).

Learning with Good Similarity Functions

Instead of working in the implicit high-dimensional projection space induced by classical SVM kernels (which may be strongly limited by symmetry and PSD requirements), we investigate a more flexible and intuitive similarity-based representation proposed recently by Balcan et al. [1] for learning with a good similarity function fulfilling the following definition.

Definition 1 ([1]). A similarity function is any pairwise function $K : X \times X \to [-1, 1]$. $K$ is an $(\epsilon, \gamma, \tau)$-good similarity function for a learning problem $P$ if there exists a (random) indicator function $R(x)$ defining a set of reasonable points such that the following conditions hold:
(i) a $1 - \epsilon$ probability mass of examples $(x, y)$ satisfies $\mathbb{E}_{(x', y') \sim P}\big[\, y y' K(x, x') \mid R(x') = 1 \,\big] \ge \gamma$;
(ii) $\Pr_{x'}[R(x') = 1] \ge \tau$.

Def. 1 requires that a large proportion of examples be on average more similar, with respect to the margin $\gamma$, to the reasonable points of the same class than to the reasonable points of the opposite class. It includes all valid kernels as well as some non-PSD similarities and is thus a generalization of kernels [1]. Given $K$ an $(\epsilon, \gamma, \tau)$-good similarity function, $LS$ a sample of $d_l$ labeled points and $R$ a set of $d_u$ potential reasonable points (landmarks), the conditions of Balcan et al. are sufficient to learn a low-error linear binary classifier (an SF classifier) in a $\phi^R$-space defined by the mapping $\phi^R$, which projects a point into the explicit space of its similarities to the landmarks in $R$:

$\phi^R : X \to \mathbb{R}^{d_u},\quad x \mapsto \big\langle K(x, x'_1), \ldots, K(x, x'_{d_u}) \big\rangle.$

The low-error SF classifier $h$ can be learned by solving the following linear problem in the $\phi^R$-space:

$\min_{\alpha}\ \frac{1}{d_l} \sum_{i=1}^{d_l} L\big(g, (x_i, y_i)\big) + \lambda \|\alpha\|_1,\quad \text{with } L\big(g, (x_i, y_i)\big) = [1 - y_i g(x_i)]_+ \text{ and } g(x) = \sum_{j=1}^{d_u} \alpha_j K(x, x'_j),$  (2)

where $[1 - z]_+ = \max(0, 1 - z)$ is the hinge loss. Finally, we have $h(x) = \mathrm{sign}[g(x)]$.

Solving (2) not only minimizes the expected source error but also defines a relevant projection space for a given problem, since landmarks associated with a null weight in the solution $\alpha$ will not be considered. In this work, we propose to add a new regularization term on $\alpha$ in order to constrain the explicit $\phi^R$-space to move the two distributions closer and thus to tend to decrease the $\mathcal{H}\Delta\mathcal{H}$-distance.

Contribution for Domain Adaptation with Good Similarity Functions

The objective here is to define a regularizer that tends to make the source and target samples indistinguishable. For this purpose, we have investigated the algorithmic robustness notion proposed by Xu and Mannor [3], based on the following property: "If a testing sample is similar to a training sample then the testing error is close to the training error". This can be formalized as follows: if, for any test point close to a training point of the same class, the deviation between the losses of the two points is low for a learned model, then this model has some generalization guarantees (even if the robustness holds for only a subpart of the training sample).
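Problem (2), introduced above, is a linear program once the hinge loss is encoded with slack variables and $\alpha$ is split as $\alpha = p - q$ with $p, q \ge 0$. A rough self-contained sketch (our own illustrative code, not the authors' implementation; the use of `scipy`, a Gaussian similarity, and taking the training points as landmarks are all assumptions):

```python
import numpy as np
from scipy.optimize import linprog

def fit_sf(K, y, lam=0.01):
    """Solve Problem (2) as an LP.  K is the (d_l x d_u) matrix of
    similarities to the landmarks, y the source labels in {-1,+1}.
    Encoding: alpha = p - q with p, q >= 0, and one slack xi_i per
    example with xi_i >= 1 - y_i * g(x_i) and xi_i >= 0 (hinge loss)."""
    dl, du = K.shape
    # objective: (1/dl) * sum(xi) + lam * sum(p + q)  [= lam * ||alpha||_1]
    c = np.concatenate([lam * np.ones(2 * du), np.ones(dl) / dl])
    # hinge constraint rewritten as: -y_i K_i.(p - q) - xi_i <= -1
    YK = y[:, None] * K
    A_ub = np.hstack([-YK, YK, -np.eye(dl)])
    b_ub = -np.ones(dl)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    return res.x[:du] - res.x[du:2 * du]   # alpha = p - q

# toy source sample: two Gaussian blobs, landmarks = training points
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (30, 2)), rng.normal(2, 1, (30, 2))])
y = np.concatenate([-np.ones(30), np.ones(30)])
K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
alpha = fit_sf(K, y, lam=0.01)
acc = np.mean(np.sign(K @ alpha) == y)
```

The 1-norm drives many coordinates of $\alpha$ to zero, so the surviving landmarks define the sparse projection space the paper exploits; note also that the pair-based regularizer added in Problem (3) below is itself a weighted 1-norm in $\alpha$, so it fits the same LP template with adjusted objective coefficients.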
This robustness result assumes, however, that the test and training data are drawn from the same distribution, and it is thus not valid in a classical DA scenario. We nevertheless propose to adapt this principle so as to make the target sample similar to the source one, which is coherent with the minimization of the divergence $d_{\mathcal{H}\Delta\mathcal{H}}$: for any pair $(x_s, x_t)$ of close source and target instances of the same class $y$, the deviation between the losses of $x_s$ and $x_t$ must be low. Considering the hinge loss of (2), this leads us to the following term to minimize for such a pair:

$\big| L(g, (x_s, y)) - L(g, (x_t, y)) \big| \;\le\; \big\| \big( {}^t\phi^R(x_s) - {}^t\phi^R(x_t) \big)\, \mathrm{diag}(\alpha) \big\|_1,$

where ${}^t\phi^R(\cdot)$ is the transposed vector of $\phi^R(\cdot)$ and $\mathrm{diag}(\alpha)$ is the diagonal matrix with $\alpha$ as main diagonal. Given any pair set $C_{ST}$ of close source-target examples, we then propose to add this term, summed over all the pairs and weighted by a parameter $\beta$, as a regularizer to Problem (2) of Balcan et al. Our global optimization problem (3) can then be formulated as the following linear program:

$\min_{\alpha}\ \frac{1}{d_l} \sum_{i=1}^{d_l} L\big(g, (x_i, y_i)\big) + \lambda \|\alpha\|_1 + \beta \sum_{(x_s, x_t) \in C_{ST}} \big\| \big( {}^t\phi^R(x_s) - {}^t\phi^R(x_t) \big)\, \mathrm{diag}(\alpha) \big\|_1,$
$\text{with } L\big(g, (x_i, y_i)\big) = [1 - y_i g(x_i)]_+ \text{ and } g(x) = \sum_{j=1}^{d_u} \alpha_j K(x, x'_j).$  (3)

This problem is defined with a 1-norm regularization, which generally leads to very sparse models. We have in fact proved in the following lemma that the sparsity of the obtained models also depends on a quantity $B_R = \min_{x'_j \in R} \big( \max_{(x_s, x_t) \in C_{ST}} |K(x_s, x'_j) - K(x_t, x'_j)| \big)$ related to the deviation between coordinates in the considered $\phi^R$-space. In other words, when the domains are far from each other, i.e. the task is hard, $B_R$ tends to be high, which can increase the sparsity.

Lemma 1. For any $\lambda > 0$, $\beta > 0$, and any set $C_{ST}$ such that $B_R > 0$, if $\alpha^*$ is optimal, then $\|\alpha^*\|_1 \le \frac{1}{\beta B_R + \lambda}$.

Moreover, according to the robustness framework [3] applied on the source domain, and from the DA bound (1), we can prove the following generalization bound for the expected target domain error.

Theorem 1. Problem (3) defines a procedure $\big( 2M_\eta, \frac{N_\eta}{\beta B_R + \lambda} \big)$-robust on the source domain, where $N_\eta = \max_{x_a, x_b \sim D_S,\ \rho(x_a, x_b) \le \eta} \| {}^t\phi^R(x_a) - {}^t\phi^R(x_b) \|_\infty$, $\eta > 0$, and $M_\eta$ is the $\eta$-covering number of $X$. Thus, for every $h$ in the SF classifiers hypothesis class, for any $\delta > 0$, with probability at least $1 - \delta$,

$err_T(h) \;\le\; \hat{err}_S(h) + \frac{N_\eta}{\beta B_R + \lambda} + \sqrt{\frac{4 M_\eta \ln 2 + 2 \ln \frac{1}{\delta}}{d_l}} + \frac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T) + \nu.$

From a practical standpoint, a critical issue is the estimation of the hyperparameters $\lambda$ and $\beta$ and the definition of the pair set $C_{ST}$, which is difficult a priori since we do not have any target label information. We propose to tackle these problems with the help of a reverse validation method.

Reverse validation and Iterative procedure

We choose the different parameters of our method by following the principle of reverse validation of Zhong et al. [7]. This principle is illustrated in Fig. 1 and consists in learning a so-called reverse classifier $h^r$ from the target sample self-labeled by a classifier $h$ (inferred with Problem (3)). We evaluate $\hat{err}_S(h^r)$ on the source sample and, heuristically, $\hat{err}_T(h^r)$ on the self-labeled target sample (both by cross-validation). We then obtain a heuristic estimate of $\nu$ (of the DA bound (1)) for $h^r$ such that $\hat{\nu} = \hat{err}_S(h^r) + \hat{err}_T(h^r)$. We select the parameters and pair set $C_{ST}$ minimizing $\hat{\nu}$. In this context, $\hat{\nu}$ can be seen as a quality measure of the $\phi^R$-space found: if the two domains are sufficiently close and related, then the reverse classifier should perform well on the source domain ([8]).

However, considering all the possible pairs for $C_{ST}$ clearly remains intractable. In practice, we select only a limited number of examples for building $C_{ST}$. We compensate the possible loss of information by a heuristic iterative procedure that still allows the two distributions to move closer. Suppose that at a given iteration $l$, with a similarity $K_l$, we obtain new weights $\alpha^l$ by solving (3). Our regularization term can actually be seen as an $L_1$-distance in a new $\phi^R_{l+1}$-space: $\| ( {}^t\phi^R_l(x_s) - {}^t\phi^R_l(x_t) )\, \mathrm{diag}(\alpha) \|_1 = \| {}^t\phi^R_{l+1}(x_s) - {}^t\phi^R_{l+1}(x_t) \|_1$, where $\phi^R_{l+1}$ corresponds to the mapping defined by the similarity $K_{l+1}$ obtained from $K_l$ by a reweighting conditional to each landmark: $\forall x'_j \in R,\ K_{l+1}(x, x'_j) = \alpha^l_j K_l(x, x'_j)$ ($K_{l+1}$ needs to be neither PSD nor symmetric according to Def. 1). We then iterate the process in the new $\phi^R_{l+1}$-space, and we stop at the iteration $l$ for which $\hat{\nu}_{l+1}$ has reached a convergence point or has increased. The main steps of the iterative approach are summarized in Algorithm 1.

Experimental evaluation

Our method DASF has been evaluated on a toy problem and on real image annotation tasks, and compared with the SF method of Problem (2) and an SVM with no adaptation, the semi-supervised Transductive SVM (T-SVM) [9], and the iterative DA algorithm DASVM [8]. We used a Gaussian kernel for the
last three methods, and a re-normalization of this kernel as a non-PSD similarity function for SF and DASF (see [10]). All the averaged results show that DASF provides better and sparser models in general. As an illustration, Tab. 1 gives results for a real image annotation task, where the source images are extracted from the PascalVOC'07 corpus and the target ones from the TrecVid'07 video corpus. Moreover, the iterative procedure always tends to decrease the distribution divergence along the iterations [4]. Among all the possible perspectives, we notably aim to consider a few target labels to help the search for a relevant projection space.

Algorithm 1 DASF (Domain Adaptation with Similarity Functions)
Input: similarity function K, set R, samples LS and TS
Output: classifier h_DASF
  h_0(.) <- sign[ (1/|R|) sum_{j=1}^{|R|} K(., x'_j) ];  K_1 <- K;  l <- 1
  while the stopping criterion is not verified do
    alpha^l <- solve Problem (3) with K_l, with C_ST and the hyperparameters selected by reverse validation
    K_{l+1} <- update K_l according to alpha^l;  update R;  l <- l + 1
  end while
  return h_DASF(.) = sign[ sum_{x'_j in R} alpha^l_j K_l(., x'_j) ]

Figure 1: The reverse validation. 1: Learning h with (3). 2: Auto-labeling the target sample with h. 3: Learning h^r on the auto-labeled target sample with (2). 4: Evaluation of h^r on LS.

CONCEPT      BOAT  BUS   CAR   MONITOR  PERSON  PLANE  AVG.
SVM          0.56  0.25  0.43  0.19     0.52    0.32   0.38
 model size  351   476   1096  698      951     428    667
SF           0.49  0.46  0.50  0.34     0.45    0.54   0.46
 model size  214   224   176   246      226     178    211
T-SVM        0.56  0.48  0.52  0.37     0.46    0.61   0.50
 model size  498   535   631   741      1024    259    615
DASVM        0.52  0.46  0.55  0.30     0.54    0.52   0.48
 model size  202   222   627   523      274     450    383
DASF         0.57  0.49  0.55  0.42     0.57    0.66   0.54
 model size  120   130   254   151      19      7      113

Table 1: The results obtained on the TrecVid target domains according to the F-measure. AVG. corresponds to the averaged results.

References
[1] M.-F. Balcan, A. Blum, and N. Srebro. Improved guarantees for learning via similarity functions. In Proceedings of COLT, 2008.
[2] S. Ben-David, J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J.W. Vaughan. A theory of learning from different domains. Machine Learning Journal, 79(1-2):151-175, 2010.
[3] H. Xu and S. Mannor. Robustness and generalization. In Proceedings of COLT, 2010.
[4] E. Morvant, A. Habrard, and S. Ayache. Sparse domain adaptation in projection spaces based on good similarity functions. In Proceedings of ICDM, 2011.
[5] Y. Mansour, M. Mohri, and A. Rostamizadeh. Domain adaptation: Learning bounds and algorithms. In Proceedings of COLT, 2009.
[6] S. Ben-David, T. Lu, T. Luu, and D. Pal. Impossibility theorems for domain adaptation. JMLR W&CP, 9:129-136, 2010.
[7] E. Zhong, W. Fan, Q. Yang, O. Verscheure, and J. Ren. Cross validation framework to choose amongst models and datasets for transfer learning. In Proceedings of ECML-PKDD, 2010.
[8] L. Bruzzone and M. Marconcini. Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE Trans. Pattern Anal. Mach. Intell., 32(5), 2010.
[9] T. Joachims. Transductive inference for text classification using support vector machines. In Proceedings of ICML, 1999.
[10] E. Morvant, A. Habrard, and S. Ayache. On the usefulness of similarity based projection spaces for transfer learning. In Proceedings of Similarity-Based Pattern Recognition workshop (SIMBAD), 2011.