Source: http://www.ibm.com/think/topics/vision-language-models

What Are Vision Language Models (VLMs)? | IBM

Rina Diane Caballar, Cole Stryker
