-
UVTM: Universal Vehicle Trajectory Modeling With ST Feature Domain Generation
Yan Lin, Jilin Hu, Shengnan Guo, Bin Yang, Christian S. Jensen, Youfang Lin, Huaiyu Wan
Keywords:Trajectory; Roads; Global Positioning System; Computational modeling; Adaptation models; Estimation; Spatiotemporal phenomena; Multitasking; Training; Feature extraction; Domain Features; Trajectory Model; Vehicle Trajectory; Real-world Datasets; Universal Model; Trajectory Prediction; Trajectory Features; Trajectory Dataset; Estimated Travel Time; Training Set; Computational Efficiency; Validation Set; Training Time; Travel Time; Mean Absolute Error; Road Network; Multiple Tasks; Learnable Parameters; Language Model; Spatiotemporal Characteristics; Road Segments; Mean Absolute Percentage Error; Departure Time; Self-supervised Learning; Matching Model; Pre-trained Language Models; Multi-task Learning; Time Trajectories; Local Roads; Similar Trajectories; Vehicle GPS trajectory; spatiotemporal data mining; pre-training and fine-tuning; self-supervised learning
Abstracts:Vehicle movement is frequently captured in the form of GPS trajectories, i.e., sequences of timestamped GPS locations. Such data is widely used for various tasks such as travel-time estimation, trajectory recovery, and trajectory prediction. A universal vehicle trajectory model could be applied to different tasks, removing the need to maintain multiple specialized models, thereby reducing computational and storage costs. However, creating such a model is challenging when the integrity of trajectory features is compromised, i.e., in scenarios where only partial features are available or the trajectories are sparse. To address these challenges, we propose the Universal Vehicle Trajectory Model (UVTM), which can effectively adapt to different tasks without excessive retraining. UVTM incorporates two specialized designs. First, it divides trajectory features into three distinct domains. Each domain can be masked and generated independently to accommodate tasks with only partially available features. Second, UVTM is pre-trained by reconstructing dense, feature-complete trajectories from sparse, feature-incomplete counterparts, enabling strong performance even when the integrity of trajectory features is compromised. Experiments involving four representative trajectory-related tasks on three real-world vehicle trajectory datasets provide insight into the performance of UVTM and offer evidence that it is capable of meeting its objectives.
-
Towards DS-NER: Unveiling and Addressing Latent Noise in Distant Annotations
Yuyang Ding, Dan Qiao, Juntao Li, Jiajie Xu, Pingfu Chao, Xiaofang Zhou, Min Zhang
Keywords:Annotations; Noise measurement; Noise; Training; Chatbots; Large language models; Data mining; Nearest neighbor methods; Named entity recognition; Data models; Latent Noise; Real-world Datasets; Language Model; External Resources; Named Entity Recognition; Annotation Methods; Rule-based Methods; False Negative; Selection Method; Negative Samples; Ability Of The Model; K-nearest Neighbor; Precision And Recall; Noisy Data; Truth Labels; Annotated Dataset; Model Confidence; Training Instances; Entity Types; Natural Language Processing Tasks; Noisy Labels; Early Stage Of Training; Pre-trained Language Models; Negative Instances; Reliable Set; Label Noise; Supervision Methods; Deep Neural Network; Distantly supervised learning; named entity recognition; noise measurement
Abstracts:Distantly supervised named entity recognition (DS-NER) has emerged as a cheap and convenient alternative to traditional human annotation methods, enabling the automatic generation of training data by aligning text with external resources. Despite many efforts on noise measurement methods, few works focus on the latent noise distribution across different distant annotation methods. In this work, we explore the effectiveness and robustness of DS-NER from two aspects: (1) distant annotation techniques, which encompass both traditional rule-based methods and the innovative large language model supervision approach, and (2) noise assessment, for which we introduce a novel framework. This framework addresses the challenges by categorizing them into the unlabeled-entity problem (UEP) and the noisy-entity problem (NEP), and provides a specialized solution for each. Our proposed method achieves significant improvements on eight real-world distant supervision datasets originating from three different data sources and involving four distinct annotation techniques, confirming its superiority over current state-of-the-art methods.
-
Top-K Representative Search for Comparative Tree Summarization
Yuqi Chen, Xin Huang, Bilian Chen
Keywords:Visualization; Ontologies; Topology; Machine learning; Hardware; Greedy algorithms; Data science; Data integration; Usability; Physics; Single Tree; Subtree; Node Weights; Hellinger Distance; Computational Efficiency; Hierarchical Structure; Sum Score; Research Papers; Tree Structure; Baseline Methods; Five-year Period; Hierarchical Tree; Common Tree; Disease Ontology; Nodes In Set; Tree summarization; top-k diversification; hellinger distance
Abstracts:Data summarization aims at utilizing a small-scale summary to represent massive datasets as a whole, which is useful for visualization and information snippet generation. However, most existing studies of hierarchical summarization only work on a single tree by selecting $k$ representative nodes, which neglects the important problem of comparative summarization on two trees. In this paper, given two trees with the same topology and different node weights, we aim at finding $k$ representative nodes, where $k_{1}$ nodes summarize the common relationship between them and $k_{2}$ nodes highlight significantly different subtrees, while satisfying $k_{1}+k_{2}=k$. To optimize summarization results, we introduce a scaling coefficient for balancing the summary view between two subtrees in terms of similarity and difference. Additionally, we propose a novel definition based on the Hellinger distance to quantify the node distribution difference between two subtrees. We present a greedy algorithm, SVDT, that efficiently finds high-quality results with an approximation guarantee. Furthermore, we explore an extension of our comparative summarization to handle two trees with different structures. Extensive experiments demonstrate the effectiveness and efficiency of our SVDT algorithm against existing summarization competitors.
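The abstract does not spell out its Hellinger-based definition, so the following is only a minimal Python sketch of the standard Hellinger distance applied to the node-weight distributions of one subtree as it appears in two trees with the same topology. The function name `hellinger` and the toy weight lists are assumptions for illustration, not the paper's SVDT scoring.

```python
import math


def hellinger(weights_a, weights_b):
    """Standard Hellinger distance between two discrete distributions.

    Each argument lists the non-negative node weights of the same subtree
    in one of the two trees (same node order, same length). The weights
    are normalized to sum to 1 before comparison.
    """
    if len(weights_a) != len(weights_b):
        raise ValueError("both subtrees must list the same nodes")
    total_a, total_b = sum(weights_a), sum(weights_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each subtree needs positive total weight")
    squared = sum(
        (math.sqrt(a / total_a) - math.sqrt(b / total_b)) ** 2
        for a, b in zip(weights_a, weights_b)
    )
    return math.sqrt(squared / 2.0)


# Hypothetical subtree with three nodes, weighted differently in each tree.
subtree_in_tree_1 = [10, 30, 60]
subtree_in_tree_2 = [60, 30, 10]
print(hellinger(subtree_in_tree_1, subtree_in_tree_2))
```

A value of 0 means the two subtrees distribute their weight identically, and a value of 1 means the weight sits on disjoint sets of nodes, which makes the measure a bounded candidate for ranking "significantly different" subtrees.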
-
Temporal and Spatial Analysis in Early Sepsis Prediction via Causal Disentanglements
Qiang Li, Dongchen Li, Weizhi Nie, He Jiao, Zhenhua Wu, Anan Liu
Keywords:Sepsis; Data models; Predictive models; Hospitals; Diseases; Physiology; MIMICs; Correlation; Transformers; Mortality; Early Prediction; Sepsis Prediction; Time Series; Prediction Accuracy; Model Performance; Transformer; Lactic Acid; Positive Samples; Latent Variables; Personal Information; Platelet Count; Clinical Indicators; Negative Samples; Joint Effect; Causal Model; Sequential Organ Failure Assessment; Serialized; Clinical Notes; Disease-related Factors; Systemic Inflammatory Response Syndrome; Time Step; Conduct Ablation Experiments; Clinical Text; Multilayer Perceptron; Pointwise Convolution; Mean Arterial Pressure; Missing Rate; Septic Shock; Arterial Pressure; Sepsis; MIMIC-IV; early prediction; causal disentanglement
Abstracts:Sepsis is one of the main causes of death in ICU patients, and accurate and stable early prediction is essential for clinical intervention. Existing methods mostly rely on traditional time series models (e.g., LSTM, Transformer) or clinical scoring criteria (e.g., SOFA, qSOFA), but face two major challenges: 1) spurious correlations in the data affect the robustness of the model; 2) the underlying causal relationships in the data space are not modeled. We propose a Serialized Causal Disentanglement Model (SCDM) that decouples latent variables into sepsis-related factors ($u$), other disease-related factors ($v$), and irrelevant confounders ($s$). On the MIMIC-IV v2.2 dataset (3,511 positive samples and 17,538 negative samples), SCDM takes patient clinical indicators, personal information, and clinical notes as input and achieves an AUC of 0.765-0.928 in the prediction task 48 to 0 hours before the onset of sepsis. This is significantly better than the baseline models (e.g., Transformer's 0.662-0.910, MGP-AttTCN's 0.692-0.913). Experiments show that optimizing the time window (5 hours of continuous observation) and variable selection (45 key indicators) can improve the performance of the model. The effectiveness of causal disentanglement is verified by Grad-CAM and t-SNE visualizations, and key clinical indicators such as platelet count, lactic acid, and respiratory rate are further identified to provide interpretable decision support for doctors. Our study provides a high-precision and interpretable causal disentanglement framework for early prediction of sepsis, which is expected to promote the development of intelligent diagnosis and treatment in the ICU.
-
ST-LLM+: Graph Enhanced Spatio-Temporal Large Language Models for Traffic Prediction
Chenxi Liu, Kethmi Hirushini Hettige, Qianxiong Xu, Cheng Long, Shili Xiang, Gao Cong, Ziyue Li, Rui Zhao
Keywords:Time series analysis; Predictive models; Forecasting; Large language models; Adaptation models; Data models; Computational modeling; Training; Electronic mail; Attention mechanisms; Spatiotemporal Model; Traffic Prediction; Traffic Prediction Model; Large Language Models; Learnable Parameters; Fewer Parameters; Urban Network; Attention Layer; Graph Attention; Global Dependencies; Spatio-temporal Dependencies; Traffic Dataset; Time Series; Convolutional Network; Convolutional Neural Network; Time Series Data; Attention Mechanism; Spatial Dependence; Anomaly Detection; Memory Usage; Graph Convolutional Network; Traffic Data; Local Dependence; Capture Complex; Token Embedding; Autoregressive Integrated Moving Average; Graph Neural Networks; Graph Convolution; Time Series Prediction; Spatiotemporal Data; Traffic prediction; large language models; spatio-temporal data
Abstracts:Traffic prediction is a crucial component of data management systems, leveraging historical data to learn spatio-temporal dynamics for forecasting future traffic and enabling efficient decision-making and resource allocation. Despite efforts to develop increasingly complex architectures, existing traffic prediction models often struggle to generalize across diverse datasets and contexts, limiting their adaptability in real-world applications. In contrast to existing traffic prediction models, large language models (LLMs) progress mainly through parameter expansion and extensive pre-training while maintaining their fundamental structures. In this paper, we propose ST-LLM+, a graph-enhanced spatio-temporal large language model for traffic prediction. By incorporating a proximity-based adjacency matrix derived from the traffic network into the calibrated LLM, ST-LLM+ captures complex spatio-temporal dependencies within the traffic network. The Partially Frozen Graph Attention (PFGA) module is designed to retain global dependencies learned during LLM pre-training while modeling localized dependencies specific to the traffic domain. To reduce computational overhead, ST-LLM+ adopts a LoRA-augmented training strategy, allowing attention layers to be fine-tuned with fewer learnable parameters. Comprehensive experiments on real-world traffic datasets demonstrate that ST-LLM+ outperforms state-of-the-art models. ST-LLM+ also exhibits robust performance in both few-shot and zero-shot prediction scenarios. Additionally, our case study demonstrates that ST-LLM+ captures global and localized dependencies between stations, verifying its effectiveness for traffic prediction tasks.
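To make the parameter saving concrete, here is a minimal pure-Python sketch of a LoRA-style linear layer under the usual low-rank formulation y = Wx + (alpha / r) * B(Ax), where W stays frozen and only A and B are trained. The helper names, shapes, and the `alpha` scaling are illustrative assumptions; the abstract does not describe how ST-LLM+ configures LoRA.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(w_frozen, lora_a, lora_b, x, alpha):
    """Forward pass of a linear layer with a low-rank update.

    w_frozen: d_out x d_in frozen pre-trained weight
    lora_a:   r x d_in trainable down-projection
    lora_b:   d_out x r trainable up-projection
    """
    rank = len(lora_a)
    base = matvec(w_frozen, x)
    update = matvec(lora_b, matvec(lora_a, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def trainable_params(d_in, d_out, rank):
    """Number of trainable weights in the low-rank update alone."""
    return rank * (d_in + d_out)


# Toy layer: 2 inputs, 2 outputs, rank 1 (all values are made up).
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]
b = [[1.0], [2.0]]
print(lora_forward(w, a, b, [1.0, 2.0], alpha=2.0))
print(trainable_params(768, 768, 8), "trainable vs", 768 * 768, "frozen")
```

With a zero-initialized up-projection the layer reproduces the frozen model exactly, so fine-tuning starts from the pre-trained behavior and only the small A and B matrices receive updates.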
-
STDA: Spatio-Temporal Deviation Alignment Learning for Cross-City Fine-Grained Urban Flow Inference
Min Yang, Xiaoyu Li, Bin Xu, Xiushan Nie, Muming Zhao, Chengqi Zhang, Yu Zheng, Yongshun Gong
Keywords:Urban areas; Feature extraction; Roads; Training; Spatiotemporal phenomena; Adaptation models; Accuracy; Transformers; Representation learning; Predictive models; Urban Flow; Large-scale Datasets; Batch Normalization; Real-world Datasets; Structural Alignment; Spatiotemporal Distribution; Urban Structure; Flow Map; Multiple Cities; Negative Transfer; Instance Normalization; Root Mean Square Error; Predictive Performance; Knowledge Transfer; Daily Data; Mean Absolute Error; Transfer Learning; Knowledge Sharing; Road Network; Convolution Kernel; Adaptive Kernel; Mean Absolute Percentage Error; Kernel Shape; Source Dataset; Transfer Learning Method; Target Dataset; Spatiotemporal Representation; Real-world Scenarios; Traffic Flow; Source Domain; Spatio-temporal data mining; cross-city transfer learning; fine-grained urban flow inference (FUFI)
Abstracts:Fine-grained urban flow inference (FUFI) is crucial for traffic management, as it infers high-resolution urban flow maps from coarse-grained observations. Existing FUFI methods typically focus on a single city and rely on comprehensive training with large-scale datasets to achieve precise inferences. However, data availability in developing cities may be limited, posing challenges to the development of well-performing models. To address this issue, we propose cross-city fine-grained urban flow inference, which aims to transfer spatio-temporal knowledge from data-rich cities to data-scarce areas using meta-transfer learning. This paper devises a Spatio-Temporal Deviation Alignment (STDA) framework to mitigate spatio-temporal distribution deviations and urban structural deviations between multiple source cities and the target city. Furthermore, STDA presents a cross-city normalization method that adaptively combines batch and instance normalization to maintain consistency between city-variant and city-invariant features. Besides, we design an urban structure alignment module to align spatial topological differences across cities. STDA effectively reduces distribution and structural deviations among different datasets while avoiding negative transfer. Extensive experiments conducted on three real-world datasets demonstrate that STDA consistently outperforms state-of-the-art baselines.
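The abstract says the cross-city normalization adaptively combines batch and instance normalization but gives no formula. As a rough illustration only, the Python sketch below blends the two with a single mixing weight `lam`, which is an assumed stand-in for whatever adaptive (presumably learned) gating STDA actually uses. The function names and the 2-D batch layout (samples by features) are also assumptions.

```python
import math


def _standardize(values, eps=1e-5):
    """Shift to zero mean and scale to unit variance (with eps for stability)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def mixed_norm(batch, lam, eps=1e-5):
    """Blend batch and instance normalization of a samples-by-features batch.

    lam = 1.0 gives pure batch normalization (statistics per feature,
    shared across samples); lam = 0.0 gives pure instance normalization
    (statistics per sample, computed over its own features).
    """
    instance = [_standardize(row, eps) for row in batch]
    columns = [_standardize(list(col), eps) for col in zip(*batch)]
    batchwise = [list(row) for row in zip(*columns)]
    return [
        [lam * b + (1.0 - lam) * i for b, i in zip(b_row, i_row)]
        for b_row, i_row in zip(batchwise, instance)
    ]


# Two hypothetical flow samples with two features each.
flows = [[0.0, 2.0], [0.0, 4.0]]
print(mixed_norm(flows, lam=0.5))
```

Batch statistics are shared across samples and instance statistics are computed within each sample, so a blend of the two is one simple way to trade off information that is common across samples against information specific to each one.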
-
SAGoG: Similarity-Aware Graph of Graphs Neural Networks for Multivariate Time Series Classification
Shun Wang, Yong Zhang, Xuanqi Lin, Yongli Hu, Qingming Huang, Baocai Yin
Keywords:Time series analysis; Correlation; Graph neural networks; Feature extraction; Deep learning; Data models; Analytical models; Spatiotemporal phenomena; Large language models; Classification algorithms; Time Series; Multivariate Time Series; Multivariate Classification; Time Series Classification; Multivariate Time Series Classification; Deep Learning; Classification Task; Outstanding Performance; Graph Structure; Graph Neural Networks; Graph Neural Network Model; Classification Accuracy; Classification Performance; Time Series Data; Average Accuracy; Time Series Analysis; Multivariate Data; Graph Convolutional Network; Time Segments; Time Series Prediction; Dynamic Time Warping; Channel Correlation; Multivariate Time Series Data; Representative Time Series; Node Representations; Dynamic Correlation; Improve Classification Performance; Graph Convolution; Dynamic Graph; Dependent Time Series; Multivariate time series; graph neural network; time series classification
Abstracts:Multivariate Time Series Classification (MTSC) has important research significance and practical value. Deep learning models have achieved considerable success in addressing MTSC problems. However, a key challenge faced by existing classification models is how to effectively consider the correlations between time series instances and across channels simultaneously, as well as how to capture the dynamics of these inter-channel correlations over time. Current methods often fall short in these aspects: on one hand, they fail to fully account for the combined effects of inter-instance and inter-channel correlations; on the other hand, they largely overlook how inter-channel correlations change over time. To address these issues, we propose a novel graph neural network model, called Similarity-Aware Graph of Graphs neural networks (SAGoG), for multivariate time series classification. This model comprehensively considers the dependencies between channel-level and instance-level time series, and dynamically learns dependency features through graph structure evolution and graph pooling layers. We conduct experiments on the UEA dataset to validate the SAGoG model, and the results demonstrate its outstanding performance in multivariate time series classification tasks.
-
Robust Tensor Completion With Side Information
Yao Wang, Qianxin Yi, Yiyang Yang, Shanxing Gao, Shaojie Tang, Di Wang
Keywords:Tensors; Complexity theory; Motion pictures; Recommender systems; Principal component analysis; Noise; Reviews; Predictive models; Noise level; Data models; Tensor Completion; Complete Model; Real-world Datasets; Recommender Systems; Link Prediction; Latent Information; Low-rank Tensor; Higher-order Tensors; Use Of Information; Information In The Form; Trusting Relationship; Prediction Problem; Frobenius Norm; User Characteristics; Helpful Information; Matrix Completion; Perfect Information; Partial Observation; Robust Problem; Nuclear Norm; Noisy Information; Tucker Decomposition; Complete Algorithm; Feature Tensor; Discrete Cosine Transform; Image Inpainting; Tensor Decomposition; Noisy Features; Explicit Features; Real-world Applications; Robust tensor completion; side information; transformed t-SVD; link prediction; recommender systems
Abstracts:Although robust tensor completion has been extensively studied, the effect of incorporating side information has not been explored. In this article, we fill this gap by developing a novel high-order robust tensor completion model that incorporates both latent and explicit side information. We base our model on the transformed t-product because the corresponding tensor tubal rank can characterize the inherent low-rank structure of a tensor. We study the effect of side information on sample complexity and prove that our model needs fewer observations than other tensor recovery methods when side information is perfect. This theoretically shows that informative side information is beneficial for learning. Extensive experimental results on synthetic and real data further demonstrate the superiority of the proposed method over several popular alternatives. In particular, we evaluate the performance of our solution based on two important applications, namely, link prediction in signed networks and rating prediction in recommender systems. We show that the proposed model, which manages to exploit side information in learning, outperforms other methods in the learning of such low-rank tensor data. Furthermore, when dealing with varying dimensions, we also design an online robust tensor completion with side information algorithm and validate its effectiveness using a real-world traffic dataset in the supplementary material.
-
RobGC: Towards Robust Graph Condensation
Xinyi Gao, Hongzhi Yin, Tong Chen, Guanhua Ye, Wentao Zhang, Bin Cui
Keywords:Training; Noise measurement; Noise reduction; Noise; Optimization; Graph neural networks; Robustness; Computational modeling; Data models; Computer science; Condensation; Training Stage; Inductive Reasoning; Graph Structure; Noisy Environments; Graph Neural Networks; Original Graph; Large Graphs; Inference Stage; Correlation Matrix; Random Noise; Structural Optimization; Singular Value Decomposition; Nodes In The Graph; Graph Data; Graph Convolutional Network; Node Features; Self-supervised Learning; Homophily; Metric Learning; Graph Neural Network Model; Adversarial Attacks; Label Propagation; Graph Optimization; Node Representations; Noisy Training; Noise Structure; Condensation Process; Bilevel Optimization; Grid Search; Graph condensation; graph structure learning
Abstracts:The increasing prevalence of large-scale graphs presents a significant challenge for training graph neural networks (GNNs) due to the computational demands involved, limiting the applicability of GNNs in various scenarios. In response to this challenge, graph condensation (GC) has been proposed as a promising acceleration solution, focusing on generating an informative compact graph that enables efficient training of GNNs while retaining performance. Despite the potential to accelerate GNN training, existing GC methods overlook the quality of large training graphs during both the training and inference stages. They indiscriminately emulate the training graph distributions, making the condensed graphs susceptible to noise within the training graph and significantly impeding the application of GC in intricate real-world scenarios. To address this issue, we propose robust graph condensation (RobGC), a plug-and-play approach for GC to extend the robustness and applicability of condensed graphs in noisy graph structure environments. Specifically, RobGC leverages the condensed graph as a feedback signal to guide the denoising process on the original training graph. A label propagation-based alternating optimization strategy is used for the condensation and denoising processes, contributing to the mutual purification of the condensed graph and training graph. Additionally, as a GC method designed for inductive graph inference, RobGC facilitates test-time graph denoising by leveraging the noise-free condensed graph to calibrate the structure of the test graph. Extensive experiments show that RobGC is compatible with various GC methods, significantly boosting their robustness.
-
QSTGNN: Quaternion Spatio-Temporal Graph Neural Networks
Ye Liu, Chaoxiong Lin, Yuchen Mou, Huaiguang Jiang, Hongmin Cai
Keywords:Quaternions; Convolution; Time series analysis; Graph neural networks; Forecasting; Correlation; Data mining; Transformers; Predictive models; Feature extraction; Neural Network; Graph Neural Networks; Spatio-temporal Graph; Spatio-temporal Graph Neural Network; Time Series; Time Step; Convolution Operation; Traffic Flow; Spatial Dependence; Learning Module; Temporal Dependencies; Climate Prediction; Time Series Prediction; Temporal Modulation; Graph Convolution; Convolution Module; Implicit Relations; Spatiotemporal Network; Multiple Graphs; Traffic Forecasting; Graph Convolutional Network; Spatio-temporal Prediction; Mean Absolute Error; Explicit Structure; Spatiotemporal Data; Long-term Time Series; Temporal Convolution; Semantic Similarity; Graph Structure; Vector Autoregressive Model; 1D quaternion convolution neural network; Quaternion; Spatio-temporal network; Spatio-temporal data forecasting; Quaternion graph neural network
Abstracts:Spatio-temporal time series forecasting has attracted great attention in various fields, including climate, power, and traffic forecasting. Recently, Spatio-temporal Graph Neural Networks (STGNNs) have shown promising performance in modeling spatial dependencies based on graph neural networks (GNNs) and temporal dependencies based on temporal learning modules. However, most STGNNs do not effectively integrate explicit and implicit relationships between nodes, nor do they adequately capture long- and short-term time dependencies. To address these challenges, this paper presents a Quaternion Spatio-temporal Graph Neural Network (QSTGNN). Specifically, the quaternion spatio-temporal graph is first constructed, such that the information of both short- and long-term time steps is preserved in a quaternion feature tensor, and the information of multiple explicit graphs and an implicit graph is integrated in a quaternion graph adjacency matrix. Then, two modules are designed: a 1D quaternion convolution module and a quaternion graph convolution module. In the 1D quaternion convolution module, complex temporal correlations among short- and long-term time steps are exploited by a 1D quaternion convolution operator based on the quaternion Hamilton product. In the quaternion graph convolution module, quaternion graph convolution is designed to characterize nonlinear dependencies among multiple spatial graphs, including explicit and implicit graphs. Extensive experiments are conducted on six datasets, and the results show that QSTGNN achieves state-of-the-art performance compared with ten existing methods. An explainability analysis shows that the multiple spatial correlations can accurately reflect traffic flow and road functional information in real road networks.
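Since the 1D quaternion convolution operator is built on the Hamilton product, a small Python sketch of that product may help. The `hamilton` function follows the standard quaternion multiplication rule; `quaternion_conv1d` is a generic "valid" sliding-window convolution added here as an assumption about what such an operator could look like, not QSTGNN's actual layer. Quaternions are plain `(real, i, j, k)` tuples.

```python
def hamilton(p, q):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def quaternion_conv1d(signal, kernel):
    """Slide a quaternion kernel over a quaternion sequence (no padding).

    out[t] = sum over k of kernel[k] (x) signal[t + k], where (x) is the
    Hamilton product. The product is not commutative, so the kernel is
    applied on the left throughout.
    """
    out = []
    for t in range(len(signal) - len(kernel) + 1):
        acc = (0.0, 0.0, 0.0, 0.0)
        for k, weight in enumerate(kernel):
            term = hamilton(weight, signal[t + k])
            acc = tuple(x + y for x, y in zip(acc, term))
        out.append(acc)
    return out


i = (0.0, 1.0, 0.0, 0.0)
j = (0.0, 0.0, 1.0, 0.0)
print(hamilton(i, j))  # i * j
print(hamilton(j, i))  # j * i, the negated result
```

Because each output component mixes all four input components, a quaternion weight ties together up to four related inputs per step, such as four views of a time series packed into one quaternion.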